[Numpy-discussion] profiling line by line
Robert Kern
robert.kern@gmail....
Thu Sep 18 14:54:13 CDT 2008
On Thu, Sep 18, 2008 at 10:03, Ondrej Certik <ondrej@certik.cz> wrote:
> Anyway, back to work: Robert K., I noticed that if I profile some
> function, I get results like this for example:
>
> ----
> 40 3072 46952 6.6 [x,w] = p_roots(n)
> 41 3072 24192 3.4 x = real(x)
> 42 3072 34470 4.8 ainf, binf = map(isinf,(a,b))
> 43 3072 6565 0.9 if ainf or binf:
> 44 raise ValueError,
> "Gaussian quadrature is only available for " \
> 45 "finite limits."
> 46 3072 5093 0.7 if not reference:
> 47 x = (b-a)*(x+1)/2.0 + a
> 48 3072 594190 83.5 return
> (b-a)/2.0*sum(w*func(x,*args),0)
> ---
>
> Then if I turn on profiling of the func() method as well, I get this:
>
> --------------------
> Line #     Hits        Time  % Time  Line Contents
>     40     3072       46999     4.6      [x,w] = p_roots(n)
>     41     3072       24313     2.4      x = real(x)
>     42     3072       34327     3.4      ainf, binf = map(isinf,(a,b))
>     43     3072        6190     0.6      if ainf or binf:
>     44                                       raise ValueError, "Gaussian quadrature is only available for " \
>     45                                           "finite limits."
>     46     3072        4918     0.5      if not reference:
>     47                                       x = (b-a)*(x+1)/2.0 + a
>     48     3072      906876    88.6      return (b-a)/2.0*sum(w*func(x,*args),0)
> -----------
>
> So the timings rise a lot; for obvious reasons, that's the overhead
> of the profiler. But the problem is that the timings then don't add
> up, e.g. if I sum the total time spent in subfunctions, it doesn't
> account for all the time printed on the respective line in the parent
> function.
>
> I don't know if there is any way to fix this, or whether it's even
> worth fixing. So I guess one should just use the profiling info to
> determine which line to fix.
So here's what's going on: I'm being clever (and possibly too clever).
When tracing is enabled, Python will call my tracing function just
before each new line gets executed. If tracing isn't enabled for this
function, I return. Otherwise, I grab the current time. Then, I look
for the last line and time I recorded for this function. I look up the
accumulator for the (code, old line) pair and record the time delta.
Then I grab the current time *again*, and store the current line and
this new time for the next go 'round.
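The scheme above can be sketched in pure Python with sys.settrace (a minimal illustration of the idea, not line_profiler's actual C implementation; the `tracer`, `accumulators`, and `work` names are made up for this sketch):

```python
import sys
import time

accumulators = {}   # (code object, line number) -> accumulated seconds
last = {}           # code object -> (last line number, timestamp)

def tracer(frame, event, arg):
    code = frame.f_code
    if event == 'call':
        return tracer                  # ask for line events in this frame
    if event in ('line', 'return'):
        # grab the current time, and charge the delta since the last
        # recorded timestamp to the *previous* line of this code object
        now = time.perf_counter()
        if code in last:
            prev_line, prev_t = last[code]
            key = (code, prev_line)
            accumulators[key] = accumulators.get(key, 0.0) + (now - prev_t)
        if event == 'return':
            last.pop(code, None)
        else:
            # grab the clock *again*, so the tracer's own bookkeeping
            # is excluded from the next delta
            last[code] = (frame.f_lineno, time.perf_counter())
    return tracer

def work(n):
    total = 0
    for i in range(n):
        total += i * i
    return total

sys.settrace(tracer)
result = work(1000)
sys.settrace(None)
```

After this runs, `accumulators` holds a per-line time total for each line of `work()`, with the tracer's own overhead largely excluded from the recorded deltas.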
This way, I exclude most of the time spent inside the profiler itself
and just record the time being spent in the code. The total time
reported is just a sum of the recorded times, not the sum of
wall-clock times spent in the function. Now, this does break down in
your use case where you are profiling both the caller and callee and
trying to determine how much of a line's time is being spent just by
calling the function.
I could record wall-clock times between the start and end of a
function call, but I think that's fragile. For example, suppose you
are profiling A() and B() but not C(), and both A() and C() call B().
Using the wall-clock time spent in B() will tell you that you spent
more time in B() than the appropriate line (or lines!) that called it
in A().
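The pitfall can be seen in a tiny sketch (the A/B/C functions here are hypothetical, matching the example above):

```python
import time

def B():
    time.sleep(0.01)       # 10 ms of real work in the shared callee

def C():
    B()                    # an *unprofiled* caller also runs B

def A():
    B()                    # the profiled line whose cost we want

# If a profiler charged all wall-clock time spent inside B() to the
# line in A() that calls it, the 10 ms consumed via C() -> B() would
# be mis-attributed to A(). Charging per-line time deltas, as
# described above, keeps the two call sites separate.
A()
C()
```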
I think the most robust way to figure this out is to rewrite your code
to pull out such calls onto their own lines. This is like breaking up
your functions into tiny little one-liners in order to use cProfile,
only it doesn't suck *nearly* as much.
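As a hedged sketch of that rewrite, here is the hot quadrature line from the profile above pulled apart so each cost lands on its own line (the surrounding `quad_tail` function and its signature are invented for illustration):

```python
import numpy as np

def quad_tail(func, x, w, a, b, args=()):
    # Before (one line, all costs merged in the line profile):
    #     return (b-a)/2.0*sum(w*func(x,*args),0)
    # After: each step gets its own line, so the line profiler can
    # attribute the callee, the reduction, and the scaling separately.
    y = func(x, *args)          # time spent in the callee shows up here
    s = np.sum(w * y, axis=0)   # time for the weighted reduction
    return (b - a) / 2.0 * s    # time for the final scaling
```

With this layout, the time charged to the `y = func(...)` line is exactly the cost of the call you were trying to isolate.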
--
Robert Kern
"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
-- Umberto Eco