[Numpy-discussion] array.sum() slower than expected along some array axes?
Sat Feb 3 20:28:19 CST 2007
On 2/3/07, Charles R Harris wrote:
> On 2/3/07, Stephen Simmons wrote:
> > Hi,
> > Does anyone know why there is an order of magnitude difference
> > in the speed of numpy's array.sum() function depending on the axis
> > of the matrix summed?
> > To see this, import numpy and create a big array with two rows:
> > >>> import numpy
> > >>> a = numpy.ones([2,1000000], 'f4')
> > Then using ipython's timeit function:
> > Expression                                      Time (ms)
> > sum(a)                                          20
> > a.sum()                                         9
> > a.sum(axis=1)                                   9
> > a.sum(axis=0)                                   159
> > numpy.dot(numpy.ones(a.shape[0], a.dtype), a)   15
> > This last one using a dot product is functionally equivalent
> > to a.sum(axis=0), suggesting that the slowdown is due to how
> > indexing is implemented in array.sum().
> >
> In this case it is expected. There are inner and outer loops; in the slow
> case the inner loop, with its extra code, is called 1000000 times, while in
> the fast case it is called only twice. On the other hand, note this:
>
> In [10]: timeit a[0,:] + a[1,:]
> 100 loops, best of 3: 19.7 ms per loop
>
> Which has only one loop. Caching could also be a problem, but in this case
> it is dominated by loop overhead.
>
PS, I think this indicates that the code would run faster in this case if it
accumulated along the last axis, one at a time for each leading index. I
suspect that the current implementation accumulates down the first axis,
then repeats for each of the last indices. This suggests that rearranging the
way the accumulation is done could be a big gain, especially if the longest
axis is chosen for the inner loop.
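
To make the two orderings concrete, here is a rough sketch in plain Python. It is not numpy's actual reduction code, and the function names sum_axis0_by_rows and sum_axis0_by_columns are made up for illustration. The first adds whole rows along the last axis, one leading index at a time; the second reduces each column separately, which is the "many short inner loops" pattern I suspect is behind the slow case.

```python
import numpy as np

def sum_axis0_by_rows(a):
    """Sum over axis 0 by adding whole rows, one leading index at a time."""
    # Start from a copy of the first row so the input is left untouched.
    total = a[0].copy()
    for row in a[1:]:
        # Each addition is a single long loop over the last axis.
        total += row
    return total

def sum_axis0_by_columns(a):
    """Sum over axis 0 by reducing each column on its own (many short loops)."""
    out = np.empty(a.shape[1], dtype=a.dtype)
    for j in range(a.shape[1]):
        # Each reduction only covers a.shape[0] elements.
        out[j] = a[:, j].sum()
    return out

# A smaller array than in the original post, to keep the column version quick.
a = np.ones((2, 1000), dtype='f4')
by_rows = sum_axis0_by_rows(a)
by_cols = sum_axis0_by_columns(a)
```

Both functions should agree with a.sum(axis=0). Timing them (for example with ipython's timeit) gives a feel for how much the per-loop overhead matters, though the numbers will depend on the machine and the numpy version, and the Python-level loop in the column version exaggerates the effect.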
Chuck
