[SciPy-User] line_profiler and for-loops !?
Tue Mar 16 05:27:05 CDT 2010
I think there are a couple of factors here. Python for loops and if statements are bad news performance-wise, so if you can remove them using Cython / C you'll gain a lot. On the other hand, they tend to suffer more of a performance penalty when the line-profiling hooks are in place than, for example, NumPy calls, which are vectorised and do all the heavy lifting in places that aren't visible to the profiler.
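One way to check how much of the cost is real rather than profiler overhead is to time the loop without any profiling hooks and compare it with a vectorised version of the same test. The following is only a rough sketch: the array contents, sizes and function names are made up for illustration and are not taken from the original code.

```python
import timeit
import numpy as np

# Hypothetical stand-in data: the last time point reached by each track.
rng = np.random.default_rng(0)
tracks_tLast = rng.integers(0, 100, size=100_000)
t = 50

def count_active_loop(t_last, t):
    """Per-element Python loop with an if test, as in the profiled code."""
    n = 0
    for tracki in range(len(t_last)):
        if t_last[tracki] == t - 1:
            n += 1
    return n

def count_active_vectorised(t_last, t):
    """The same test done with a single NumPy comparison."""
    return int(np.count_nonzero(t_last == t - 1))

# Time both versions with no profiler attached.
loop_s = timeit.timeit(lambda: count_active_loop(tracks_tLast, t), number=3)
vec_s = timeit.timeit(lambda: count_active_vectorised(tracks_tLast, t), number=3)
print(f"loop: {loop_s:.4f} s, vectorised: {vec_s:.4f} s")
```

If the un-profiled loop still accounts for a large share of the run time, the line_profiler figures are not just an artifact.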
You might also be able to improve performance by doing something like:
for tracki, track in enumerate(self.tracks[self.tracks_tLast == t-1]):
if tracks and tracks_tLast are numpy arrays. Note that tracki then counts within the selected subset rather than indexing the full self.tracks.
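If the original track index is still needed inside the loop, or if tracks is a plain list of variable-length lists rather than an array, a variant is to compute the indices of the active tracks first and loop over those. This is a minimal sketch under those assumptions; the data and the last_points dictionary are invented for the example.

```python
import numpy as np

# Hypothetical stand-in data: four tracks, each a list of point indices,
# and the last time point at which each track was extended.
tracks = [[0, 3], [1], [2, 4, 5], [6]]
tracks_tLast = np.array([4, 2, 4, 3])
t = 5

# Indices of the tracks that went on until t-1. One vectorised comparison
# replaces the per-track "if" inside the loop.
active = np.flatnonzero(tracks_tLast == t - 1)

last_points = {}
for tracki in active:
    track = tracks[tracki]
    last_points[int(tracki)] = track[-1]  # index of the point at time t-1

print(last_points)
```

The loop body now runs only for tracks that are still alive, and tracki keeps its meaning as an index into the full list.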
Just out of interest, how many points/observations are you trying to track? I've been working on something for stitching together tracks from lists of positions as well and would be interested in comparing strategies.
--- On Tue, 16/3/10, Sebastian Haase <email@example.com> wrote:
> From: Sebastian Haase <firstname.lastname@example.org>
> Subject: [SciPy-User] line_profiler and for-loops !?
> To: "SciPy Users List" <SciPyemail@example.com>
> Received: Tuesday, 16 March, 2010, 9:40 PM
> I was starting to use Robert's line_profiler. It seems to
> work great,
> and I already found one easy way to halve my execution time.
> But now it claims that 33% of the time is spent (directly)
> in the
> "for"-line and another 36% in a very simple "if"-line. See
> parts of
> the output here:
> Function: doTracing at line 1135
> Total time: 23.9171 s
>
> Line #      Hits      Time  Per Hit  % Time  Line Contents
>                                              # iterate over all tracks, and find close points
>   1186   3853362   8024186      2.1          for tracki,track in enumerate(self.tracks):
>   1187   3853063   8639273      2.2              if self.tracks_tLast[tracki] == t-1:
>                                                      # track went on until t-1 (so far)
>                                                      # -- otherwise, skip "old" tracks
>   1190     62150    130277      2.1                  pi_t_1 = track[-1] # index in last time
>
> (The object of the function is to connect closest points
> found in an
> image sequence into tracks connecting the points by
> shortest steps.)
> Anyhow, my question is, is this just an artifact of
> line_profiler, or
> is the fact that those two lines are hit almost 4e6 times
> resulting in more than 50% of the time being spent here !?
> (Calculating the actual Euclidean distance matrix over all
> point pairs
> takes supposedly only 15% of the time, for comparison).
> I tried to separate out the "enumerate(self.tracks)" into a
> line before the "for"-line, but the time spent was still
> unchanged on
> the "for".
> Does this mean "python is slow" here - and I should try
> cython (which
> I have never done so far ...) ?
> Sebastian Haase