[Numpy-discussion] Sum, multiply are slow ?
Thu Jul 12 01:47:05 CDT 2007
David Cournapeau wrote:
> While profiling some code, I noticed that sum in numpy is kind of
> slow once you use axis argument:
Yes, this is expected because when using an axis argument, the
following two things can happen:
1) You may be skipping over large chunks of memory to get to the next
number in the reduction, and out-of-cache memory access is slow.
2) You have to allocate a result array.
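To make the two points concrete, here is a minimal sketch (not taken from the original report) that inspects the strides of a C-contiguous array and the shape of the result each reduction allocates. The smaller array size and the variable names are assumptions chosen only for illustration.

```python
import numpy as np

# Smaller than the array in the report, just to keep the sketch quick.
a = np.random.randn(1000, 30)  # C-contiguous float64 by default

# Strides are the byte steps between consecutive elements along each axis.
# Stepping along axis 0 jumps a whole row of bytes; stepping along axis 1
# moves to the adjacent element.
axis0_step, axis1_step = a.strides
print(a.strides)

# A full reduction returns a scalar, while an axis reduction has to
# allocate a new result array.
total = a.sum()
col_sums = a.sum(axis=0)  # one entry per column
row_sums = a.sum(axis=1)  # one entry per row
print(col_sums.shape, row_sums.shape)
```

How much the larger stride costs in practice depends on the array size, the cache sizes and the NumPy version, so treat this as a way to look at the memory layout rather than as a benchmark.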
> import numpy as N
> a = N.random.randn(100000, 30)
> %timeit N.sum(a) #-> 26.8ms
> %timeit N.sum(a, 1) #-> 65.5ms
> %timeit N.sum(a, 0) #-> 141ms
> Now, if I use some tricks, I get:
> %timeit N.sum(a) #-> 26.8 ms
> %timeit N.dot(a, N.ones(a.shape[1], a.dtype)) #-> 11.3ms
> %timeit N.dot(N.ones((1, a.shape[0]), a.dtype), a) #-> 15.5ms
> I realize that dot uses optimized libraries (atlas in my case) and all,
> but is there any way to improve this situation ?
Sum does *not* use an optimized library so it is not too surprising that
you can get speed-ups using ATLAS. It would be nice to do something to
optimize the reduction functions in NumPy, but nobody has come forward
with suggestions yet.
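For reference, the dot-based trick from the report can be sketched as below, with a check that it gives the same values as sum. This is only an illustration: the array size and variable names are assumptions, and whether dot is actually faster depends on the BLAS library NumPy was built against.

```python
import numpy as np

a = np.random.randn(1000, 30)

# Vectors of ones whose lengths match the axis being summed away.
ones_cols = np.ones(a.shape[1], a.dtype)
ones_rows = np.ones(a.shape[0], a.dtype)

# Multiplying by a vector of ones adds up the entries along one axis,
# and the work is handed to whatever library backs dot.
row_sums = np.dot(a, ones_cols)  # same values as a.sum(axis=1)
col_sums = np.dot(ones_rows, a)  # same values as a.sum(axis=0)

print(np.allclose(row_sums, a.sum(axis=1)))
print(np.allclose(col_sums, a.sum(axis=0)))
```

The results can differ from sum in the last few bits because the additions happen in a different order, hence the use of allclose rather than exact equality.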
Thanks for the reports, though.