[Numpy-discussion] question about optimizing

Anne Archibald peridot.faceted@gmail....
Fri May 16 19:48:04 CDT 2008


2008/5/16 Brian Blais <bblais@bryant.edu>:

> I have a custom array, which contains custom objects (I give a stripped down
> example below), and I want to loop over all of the elements of the array and
> call a method of the object.  I can do it like:
>     a=MyArray((5,5),MyObject,10)
>
>     for obj in a.flat:
>         obj.update()
> but I was wondering if there is a faster way, especially if obj.update is a
> cython-ed function.  I was thinking something like apply_along_axis, but
> without having an input array at all.
> Is there a better way to do this?  While I'm asking, is there a better way
> to overload ndarray than what I am doing below?  I tried to follow code I
> found online, but the examples of this are few and far between.

Unfortunately, the loop overhead is small compared to the overhead of
Python method dispatch, so getting rid of the loop is not going to make
things much faster. For convenience you can do something like

from numpy import vectorize
update = vectorize(lambda obj: obj.update(), otypes=[object])  # otypes: no trial call

and then later

update(a)
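
To make the vectorize idea concrete, here is a minimal runnable sketch.
It is not Brian's MyArray/MyObject code: Counter is a hypothetical
stand-in class, and a plain object-dtype ndarray replaces the custom
array. One detail worth knowing: when otypes is not given,
numpy.vectorize makes a trial call on the first element to work out the
output type, so a method with side effects would run once more than
expected on that element. Passing otypes avoids this.

```python
import numpy as np


class Counter:
    """Hypothetical stand-in for MyObject: update() bumps a counter."""

    def __init__(self):
        self.n = 0

    def update(self):
        self.n += 1


# Build a 5x5 object array with a distinct Counter in every slot.
a = np.empty((5, 5), dtype=object)
for idx in np.ndindex(a.shape):
    a[idx] = Counter()

# otypes is given explicitly, so no trial call is made on the first element.
update = np.vectorize(lambda obj: obj.update(), otypes=[object])
update(a)

# Collect the per-object call counts for inspection.
counts = [obj.n for obj in a.flat]
```

This is still one Python method call per element, so it is a
convenience and not a speedup.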

But if you really want things to go quickly, I think the best approach
is to take advantage of the main feature of arrays: they hold
homogeneous items efficiently. So use the array to store the contents
of your object, and put any special behaviour in the array object. For
example, if I wanted to efficiently compute with arrays of numbers
carrying units, I would attach a unit to the array as a whole, and
have the array store doubles. With some cleverness, you can even have
accesses to the array return a freshly-created object whose contents
are based on the values you looked up in the array. But storing actual
Python objects in an array is probably not a good idea, since every
operation on them still goes through Python-level dispatch.
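
A rough sketch of the units idea described above, under some
assumptions: UnitArray and Quantity are hypothetical names, and the
sketch wraps an ndarray by composition instead of subclassing ndarray,
purely to keep the example short. The array stores doubles, the unit is
attached to the array as a whole, and scalar lookups hand back a freshly
created object.

```python
import numpy as np


class Quantity:
    """Hypothetical scalar wrapper, created on demand and never stored in the array."""

    def __init__(self, value, unit):
        self.value = value
        self.unit = unit

    def __repr__(self):
        return f"{self.value} {self.unit}"


class UnitArray:
    """Hypothetical container: one float64 array plus one unit for the whole array."""

    def __init__(self, values, unit):
        self.values = np.asarray(values, dtype=np.float64)
        self.unit = unit

    def update(self, factor):
        # A single vectorized operation replaces a Python loop over objects.
        self.values *= factor

    def __getitem__(self, index):
        result = self.values[index]
        if np.ndim(result) == 0:
            # Scalar lookup: build a fresh object from the stored double.
            return Quantity(float(result), self.unit)
        # Sub-array lookup: keep the unit attached to the selection.
        return UnitArray(result, self.unit)


lengths = UnitArray(np.ones((5, 5)), "m")
lengths.update(2.5)
item = lengths[1, 2]   # a Quantity built at lookup time
row = lengths[0]       # a UnitArray wrapping the first row
```

Here the per-element work happens inside NumPy, and Python objects only
appear when a single element is actually requested.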

Anne

