[Numpy-discussion] Speeding up wxPython/numarray

Tim Hochberg tim.hochberg at cox.net
Wed Jun 30 12:58:17 CDT 2004


I spent some time seeing what I could do in the way of speeding up 
wxPoint_LIST_helper by tweaking the numarray code. My first suspect was 
_universalIndexing by way of _ndarray_item. However, due to some 
new-style machinations, _ndarray_item was never getting called. Instead, 
_ndarray_subscript was being called. So, I added a special case to 
_ndarray_subscript. This sped things up by 50% or so (I don't recall 
exactly). The code for that is at the end of this message; it's not 
guaranteed to be 100% correct; it's all experimental.

After futzing around some more I figured out a way to trick Python into 
using _ndarray_item. I added "type->tp_as_sequence->sq_item = 
_ndarray_item;" to _ndarray_new. I then optimized _ndarray_item (code 
at end). This halved the execution time of my arbitrary benchmark. This 
trick may have horrible, unforeseen consequences, so use at your own risk.
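
For anyone unfamiliar with type slots, the trick boils down to overwriting a function pointer on the type so that integer indexing lands in the specialized routine. Here is a rough, self-contained sketch of that idea in plain C. It is a toy model only: every toy_* name is made up for illustration, and none of this is the real CPython or numarray structure layout.

```c
#include <stddef.h>

/* Toy model only: these structs mimic the shape of a type object that owns
   a sequence-method table. They are NOT the real CPython definitions. */
typedef struct toy_array toy_array;
typedef int (*toy_item_func)(toy_array *self, long i, long *out);

typedef struct {
    toy_item_func sq_item;              /* slot used for integer indexing */
} toy_sequence_methods;

typedef struct {
    toy_sequence_methods *tp_as_sequence;
} toy_type;

struct toy_array {
    toy_type *type;
    const long *data;
    long length;
};

/* Counts how often the generic path runs, so the effect is observable. */
static int toy_generic_calls = 0;

/* Hypothetical stand-in for the general indexing path. */
static int toy_item_generic(toy_array *self, long i, long *out)
{
    toy_generic_calls++;
    if (i < 0 || i >= self->length)
        return -1;
    *out = self->data[i];
    return 0;
}

/* Hypothetical stand-in for the specialized integer-index path. */
static int toy_item_fast(toy_array *self, long i, long *out)
{
    if (i < 0 || i >= self->length)
        return -1;
    *out = self->data[i];
    return 0;
}

/* Callers index through whatever function the slot currently holds. */
static int toy_get_item(toy_array *self, long i, long *out)
{
    return self->type->tp_as_sequence->sq_item(self, i, out);
}

/* The trick: overwrite the slot on the type with the fast routine. */
static void toy_install_fast_item(toy_type *type)
{
    type->tp_as_sequence->sq_item = toy_item_fast;
}
```

Once toy_install_fast_item has run on a type, every toy_get_item call on objects of that type skips the generic routine. Note that the change affects the whole type, not a single object, which is part of why the real trick deserves caution.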

Finally, I commented out the __del__ method in numarraycore. This resulted 
in an additional speedup of 64%, for a total speedup of 240%. Still not 
close to 10x, but a large improvement. This is obviously not viable for 
real use, but it's enough of a speedup that I'll try to see if there's 
any way to move the shadow stuff back to tp_dealloc.

In summary:

Version              Time    Rel Speedup   Abs Speedup
Stock                0.398   ----          ----
_ndarray_item mod    0.192   107%          107%
del __del__          0.117   64%           240%
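
The percentages in the table are consistent with computing speedup as (old time / new time - 1) * 100, where "Rel" compares against the previous row and "Abs" compares against the stock build. A minimal sketch of that arithmetic follows; the helper name speedup_percent is made up for illustration.

```c
/* Hypothetical helper: percentage speedup when the run time drops from
   old_time to new_time. A result of 100 means "twice as fast". */
static double speedup_percent(double old_time, double new_time)
{
    return (old_time / new_time - 1.0) * 100.0;
}
```

For example, speedup_percent(0.398, 0.192) is roughly 107, speedup_percent(0.192, 0.117) is roughly 64, and speedup_percent(0.398, 0.117) is roughly 240.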

There were a couple of other things I tried that resulted in additional 
small speedups, but the tactics I used were too horrible to reproduce 
here. The main one of interest is that all of the calls to 
NA_updateDataPtr seem to burn some time. However, I don't have any idea 
what one could do about that.

That's all for now.

-tim




static PyObject*
_ndarray_subscript(PyArrayObject* self, PyObject* key)
{
    PyObject *result;
#ifdef TAH
    /* Experimental fast path: an exact int key is indexed directly,
       bypassing _universalIndexing. */
    if (PyInt_CheckExact(key)) {
        long ikey = PyInt_AsLong(key);
        long offset;
        if (NA_getByteOffset(self, 1, &ikey, &offset) < 0)
            return NULL;
        if (!NA_updateDataPtr(self))
            return NULL;
        return _simpleIndexingCore(self, offset, 1, Py_None);
    }
#endif
#if _PYTHON_CALLBACKS
    result = PyObject_CallMethod(
        (PyObject *) self, "_universalIndexing", "(OO)", key, Py_None);
#else
    result = _universalIndexing(self, key, Py_None);
#endif
    return result;
}



static PyObject *
_ndarray_item(PyArrayObject *self, int i)
{
#ifdef TAH
    long offset;
    if (NA_getByteOffset(self, 1, &i, &offset) < 0)
        return NULL;
    if (!NA_updateDataPtr(self))
        return NULL;
    return _simpleIndexingCore(self, offset, 1, Py_None);
#else
    PyObject *result;
    PyObject *key = PyInt_FromLong(i);
    if (!key) return NULL;
    result = _universalIndexing(self, key, Py_None);
    Py_DECREF(key);
    return result;
#endif
}






