[Numpy-discussion] Speeding up wxPython/numarray

Todd Miller jmiller at stsci.edu
Thu Jul 1 09:59:01 CDT 2004

On Wed, 2004-06-30 at 19:00, Tim Hochberg wrote: 
> By this do you mean the "#if PY_VERSION_HEX >= 0x02030000 " that is 
> wrapped around _ndarray_item? If so, I believe that it *is* getting 
> compiled, it's just never getting called.
> What I think is happening is that the class NumArray inherits its 
> sq_item from PyClassObject. In particular, I think it picks up 
> instance_item from Objects/classobject.c. This appears to be fairly 
> expensive and, I think, ends up calling tp_as_mapping->mp_subscript. 
> Thus, _ndarray's sq_item slot never gets called. All of this is pretty 
> iffy since I don't know this stuff very well and I didn't trace it all 
> the way through. However, it explains what I've seen thus far.
> This is why I ended up using the horrible hack. I'm resetting NumArray's 
> sq_item to point to _ndarray_item instead of instance_item.  I believe 
> that access at the Python level goes through mp_subscript, so it 
> shouldn't be affected, and only objects at the C level should notice and 
> they should just get the faster sq_item. You will notice that there are 
> an awful lot of "I think"s in the above paragraphs though...

Ugh...  Thanks for explaining this.

> >>I then optimized _ndarray_item (code 
> >>at end). This halved the execution time of my arbitrary benchmark. This 
> >>trick may have horrible, unforeseen consequences so use at your own risk.
> >>    
> >>
> >
> >Right now the sq_item hack strikes me as somewhere between completely
> >unnecessary and too scary for me!  Maybe if python-dev blessed it.
> >  
> >
> Yes, very scary. And it occurs to me that it will break subclasses of 
> NumArray if they override __getitem__. When these subclasses are 
> accessed from C they will see _ndarray's sq_item instead of the 
> overridden getitem.   However,  I think I also know how to fix it. But 
> it does point out that it is very dangerous and there are probably dark 
> corners of which I'm unaware. Asking on Python-List or PyDev would 
> probably be a good idea.
> The nonscary, but painful, fix would be to rewrite NumArray in C.

Non-scary to whom?
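
The subclass concern raised above can be illustrated with a pure-Python analogue. This is only a sketch of the failure mode, not numarray code: `Base`, `Doubled`, and both helper functions are hypothetical names, and calling `Base.__getitem__` directly merely stands in for C code that calls an sq_item slot pinned to the base implementation.

```python
class Base:
    """Stand-in for the base type whose item getter is installed as the fast path."""

    def __init__(self, data):
        self._data = list(data)

    def __getitem__(self, index):
        return self._data[index]


class Doubled(Base):
    """Hypothetical subclass that overrides __getitem__."""

    def __getitem__(self, index):
        return 2 * self._data[index]


def item_via_lookup(obj, index):
    # Normal path: dispatches on the object's actual type.
    return obj[index]


def item_via_pinned_slot(obj, index):
    # Analogue of C code calling a slot hard-wired to the base implementation,
    # which silently skips any override defined in a subclass.
    return Base.__getitem__(obj, index)


d = Doubled([1, 2, 3])
print(item_via_lookup(d, 1))       # 4, the override is honoured
print(item_via_pinned_slot(d, 1))  # 2, the override is bypassed
```

The two calls disagree for the subclass instance, which is the kind of silent divergence between Python-level and C-level access that makes the hack risky.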

> >This optimization looks good to me.
> >  
> >
> Unfortunately, I don't think the optimization to sq_item will affect 
> much since NumArray appears to override it with
> >>Finally I commented out the __del__ method in numarraycore. This resulted 
> >>in an additional speedup of 64% for a total speed up of 240%. Still not 
> >>close to 10x, but a large improvement. However, this is obviously not 
> >>viable for real use, but it's enough of a speedup that I'll try to see 
> >>if there's any way to move the shadow stuff back to tp_dealloc.
> >>    
> >>
> >
> >FYI, the issue with tp_dealloc may have to do with which mode Python is
> >compiled in, --with-pydebug, or not.  One approach which seems like it
> >ought to work (just thought of this!) is to add an extra reference in C
> >to the NumArray instance __dict__ (from NumArray.__init__ and stashed
> >via a new attribute in the PyArrayObject struct) and then DECREF it as
> >the last part of the tp_dealloc.  
> >  
> >
> That sounds promising.
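
For anyone who wants to gauge the cost of a Python-level `__del__` on their own interpreter, a rough micro-benchmark along these lines may help. It is a sketch under stated assumptions: `Plain`, `WithDel`, and the `shadow` attribute are hypothetical stand-ins rather than numarray's classes, and the timings will not reproduce the figures quoted above.

```python
import timeit


class Plain:
    """Instance with a per-object dict attribute and no finalizer."""

    def __init__(self):
        self.shadow = {}


class WithDel:
    """Same layout, plus a Python-level __del__ doing trivial cleanup."""

    def __init__(self):
        self.shadow = {}

    def __del__(self):
        # Placeholder cleanup standing in for the "shadow stuff".
        self.shadow = None


def time_create_destroy(cls, number=200_000):
    # Each call creates an instance and immediately drops the only
    # reference, so construction and destruction are both measured.
    return timeit.timeit(cls, number=number)


def compare(number=200_000):
    return time_create_destroy(Plain, number), time_create_destroy(WithDel, number)


plain_s, with_del_s = compare()
print(f"without __del__: {plain_s:.3f} s")
print(f"with __del__:    {with_del_s:.3f} s")
```

The ratio between the two numbers gives a feel for how much a finalizer written in Python adds to the lifetime of many short-lived objects.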

I looked at this some, and while INCREFing __dict__ may be the right
idea, I forgot that there *is no* Python NumArray.__init__ anymore.  

So the INCREF needs to be done in C without doing any getattrs;  this
seems to mean calling a private _PyObject_GetDictPtr function to get a
pointer to the __dict__ slot which can be dereferenced to get the

> [SNIP]
> >
> >Well, be picking out your beer.
> >  
> >
> I was only about half right, so I'm not sure I qualify...

We could always reduce your wages to a 12-pack... 

