[off topic] Re: [Numpy-discussion] numarray speed - PySequence_GetItem
tim.hochberg at cox.net
Tue Jun 29 10:12:38 CDT 2004
Todd Miller wrote:
>On Mon, 2004-06-28 at 17:14, Sebastian Haase wrote:
>>My original question was just this: Does anyone know why numarray is maybe 10
>>times slower than Numeric with that particular code segment?
>Well, the short answer is probably: no.
>Looking at the numarray sequence protocol benchmarks in
>Examples/bench.py, and looking at what wxPython is probably doing
>(fetching a 1x2 element array from an Nx2 and then fetching 2 numerical
>values from that)... I can't fully nail it down. My benchmarks show
>that numarray is 4x slower for fetching the two element array but only
>1.1x slower for the two numbers; that makes me expect at most 4x
>Noticing the 50k __del__ calls in your profile, I eliminated __del__
>(breaking numarray) to see if that was the problem; the ratios changed
>to 2.5x slower and 0.9x slower (actually faster) respectively.
This reminds me, when profiling bits and pieces of my code I've often
noticed that __del__ chews up a large chunk of time. Is there any
prospect of this being knocked down at all, or is it inherent in the
structure of numarray?
>The large number of "Check" routines preceding the numarray path (I
>count 7 looking at my copy of wxPython) has me a little concerned. I
>think those checks are more expensive for numarray because it is a new
If that's really a significant slowdown, the culprits are likely
PyTuple_Check, PyList_Check and wxPySwigInstance_Check.
PySequence_Check appears to just be pointer compares and shouldn't
invoke any new style class machinery. PySequence_Length calls sq_length,
but appears also to not involve new class machinery. Of these, I think
PyTuple_Check and PyList_Check could be replaced with PyTuple_CheckExact
and PyList_CheckExact. This would slow down people using subclasses of
tuple/list, but speed everyone else up since the latter pair of
functions are just pointer compares. I think the former group is a very
small, possibly nonexistent, minority, so this would probably
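To make the Check versus CheckExact trade-off concrete, here is a toy model in plain C. It is only a sketch and not CPython's implementation: ToyType, ToyObject, the toy_* functions and the "ArrayBase"/"Array" hierarchy are all made-up names for illustration. The exact check is one pointer compare; the subclass-aware check walks the base chain and so costs more for objects that sit deeper in a class hierarchy.

```c
#include <stddef.h>

/* Toy stand-in for a type object: each type points at its base type. */
typedef struct ToyType {
    const char *name;
    const struct ToyType *base;   /* NULL at the root of the hierarchy */
} ToyType;

/* Toy stand-in for an object header: every object points at its type. */
typedef struct ToyObject {
    const ToyType *type;
} ToyObject;

/* A small hypothetical hierarchy: object <- tuple, object <- ArrayBase <- Array */
static const ToyType toy_object_type = {"object", NULL};
static const ToyType toy_tuple_type  = {"tuple", &toy_object_type};
static const ToyType toy_base_type   = {"ArrayBase", &toy_object_type};
static const ToyType toy_array_type  = {"Array", &toy_base_type};

static const ToyObject toy_tuple_obj = {&toy_tuple_type};
static const ToyObject toy_array_obj = {&toy_array_type};

/* "CheckExact" style: a single pointer comparison, subclasses are rejected. */
static int toy_check_exact(const ToyObject *o, const ToyType *want)
{
    return o->type == want;
}

/* "Check" style: accept subclasses too, so walk the chain of base types. */
static int toy_check(const ToyObject *o, const ToyType *want)
{
    const ToyType *t;
    for (t = o->type; t != NULL; t = t->base)
        if (t == want)
            return 1;
    return 0;
}

/* Number of types toy_check visits before it can answer. */
static int toy_check_cost(const ToyObject *o, const ToyType *want)
{
    int visited = 0;
    const ToyType *t;
    for (t = o->type; t != NULL; t = t->base) {
        visited++;
        if (t == want)
            break;
    }
    return visited;
}
```

In this model a real tuple is recognized after visiting one type either way, while the "Array" object has to be walked all the way up to the root before the tuple check can say no. That extra walk is the kind of cost the exact variants avoid.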
I don't see any easy/obvious ways to speed up wxPySwigInstance_Check,
but I believe that wxPoints now obey the PySequence protocol, so I think
that the whole wxPySwigInstance_Check branch could be removed. To get
that into wxPython you'd probably have to convince Robin that it
wouldn't hurt the speed of list of wxPoints unduly.
Wait... If the above doesn't work, I think I do have a way that might
work for speeding the check for a wxPoint. Before the loop starts, get a
pointer to wx.core.Point (the class for wxPoints these days) and call it
wxPoint_Type. Then just use for the check:
o->ob_type == wxPoint_Type
Worth a try anyway.
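For what it's worth, the shape of that idea looks roughly like the following toy model in plain C. Again this is a sketch under assumptions, not wxPython or CPython code: ToyType, ToyObject, toy_lookup and toy_count_points are invented names, and the string lookup merely stands in for fetching the class from its module. The point is that the comparatively slow lookup happens once before the loop, and each item then costs one pointer compare. Since the type is held as a pointer fetched at run time, the comparison uses that pointer directly.

```c
#include <stddef.h>
#include <string.h>

/* Toy stand-ins for a type object and an object header. */
typedef struct ToyType { const char *name; } ToyType;
typedef struct ToyObject { const ToyType *type; } ToyObject;

static const ToyType toy_point_type = {"Point"};
static const ToyType toy_array_type = {"Array"};

/* Hypothetical registry, standing in for a module's namespace. */
static const ToyType *const toy_registry[] = {&toy_point_type, &toy_array_type};

/* Slow lookup by name; returns NULL when the name is unknown. */
static const ToyType *toy_lookup(const char *name)
{
    size_t i;
    for (i = 0; i < sizeof toy_registry / sizeof toy_registry[0]; i++)
        if (strcmp(toy_registry[i]->name, name) == 0)
            return toy_registry[i];
    return NULL;
}

/* Count the items whose type is exactly the point type. */
static size_t toy_count_points(const ToyObject *items, size_t n)
{
    /* Do the lookup once, before the loop starts. */
    const ToyType *point_type = toy_lookup("Point");
    size_t i, count = 0;

    if (point_type == NULL)
        return 0;
    for (i = 0; i < n; i++)
        if (items[i].type == point_type)   /* one pointer compare per item */
            count++;
    return count;
}

/* Sample data: point, array, point. */
static const ToyObject toy_items[] = {
    {&toy_point_type}, {&toy_array_type}, {&toy_point_type}
};
```

Like any exact check, this rejects subclasses of the point type, so those would have to fall through to whatever slower path follows.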
Unfortunately, I don't have any time to try any of this out right now.
Chris, are you feeling bored?
>I have a hard time imagining a 10x difference overall,
>but I think Python does have to traverse the numarray class hierarchy
>rather than do a type pointer comparison so they are more expensive.