[Numpy-discussion] Sum, multiply are slow ?
Thu Jul 12 01:33:03 CDT 2007
Travis Oliphant wrote:
> David Cournapeau wrote:
>> While profiling some code, I noticed that sum in numpy is kind of
>> slow once you use the axis argument:
> Yes, this is expected because when using an axis argument, the
> following two things can happen:
> 1) You may be skipping over large chunks of memory to get to the next
> available number and out-of-cache memory access is slow.
> 2) You have to allocate a result array.
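A minimal sketch of those two points, assuming a C-contiguous float64 array (the default layout) and a smaller array than in the benchmark. The strides show how far apart in memory consecutive elements along each axis are, and each axis reduction returns a newly allocated result:

```python
import numpy as N

# Assumption: default C-contiguous layout, float64 (8 bytes per element).
a = N.zeros((1000, 30), dtype=N.float64)

# Strides are given in bytes. One step along axis 0 jumps over a whole row
# (30 * 8 = 240 bytes); one step along axis 1 moves to the adjacent element.
row_stride, col_stride = a.strides

# Each reduction along an axis allocates a new result array.
s0 = a.sum(axis=0)  # one value per column, shape (30,)
s1 = a.sum(axis=1)  # one value per row, shape (1000,)
```

For a Fortran-ordered array the strides would be swapped, so which axis is the "far apart" one depends on the layout.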
>> import numpy as N
>> a = N.random.randn(100000, 30)
>> %timeit N.sum(a) #-> 26.8ms
>> %timeit N.sum(a, 1) #-> 65.5ms
>> %timeit N.sum(a, 0) #-> 141ms
>> Now, if I use some tricks, I get:
>> %timeit N.sum(a) #-> 26.8 ms
>> %timeit N.dot(a, N.ones(a.shape[1], a.dtype)) #-> 11.3ms
>> %timeit N.dot(N.ones((1, a.shape[0]), a.dtype), a) #-> 15.5ms
>> I realize that dot uses optimized libraries (atlas in my case) and all,
>> but is there any way to improve this situation ?
> Sum does *not* use an optimized library so it is not too surprising that
> you can get speed-ups using ATLAS.
I understand that there is no optimization going on with sum or
multiply. This was just to have a comparison (this kind of thing varies
*a lot* across CPUs of the same architecture).
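For reference, the dot-based comparison amounts to multiplying by a vector of ones. A small sketch checking that it gives the same values as sum along each axis, assuming the ones vectors have lengths `a.shape[1]` and `a.shape[0]` respectively and using a smaller array than in the timings:

```python
import numpy as N

rng = N.random.default_rng(0)
a = rng.standard_normal((1000, 30))

# Sum along axis 1: right-multiply by a vector of ones of length a.shape[1].
row_sums = N.dot(a, N.ones(a.shape[1], a.dtype))         # shape (1000,)

# Sum along axis 0: left-multiply by a row of ones of length a.shape[0].
col_sums = N.dot(N.ones((1, a.shape[0]), a.dtype), a)    # shape (1, 30)

same_rows = N.allclose(row_sums, N.sum(a, 1))
same_cols = N.allclose(col_sums[0], N.sum(a, 0))
```

Whether dot is actually faster depends on the BLAS library numpy is linked against, so this only checks equivalence, not speed.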
> It would be nice to do something to
> optimize the reduction functions in NumPy, but nobody has come forward
> with suggestions yet.
So is it possible to improve things? I noticed that sum/multiply and
co. are using reduction functions. Should I follow the same scheme as
what I did for clip (following the dot-related optimization, basically)?
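To make "reduction functions" concrete, here is a short sketch of the correspondence as I understand it: sum and prod along an axis give the same result as the `reduce` method of the `add` and `multiply` ufuncs. The array is only an illustrative example:

```python
import numpy as N

a = N.arange(12, dtype=N.float64).reshape(3, 4)

# Summing along an axis matches the add ufunc's reduce method.
via_sum = N.sum(a, axis=0)
via_add_reduce = N.add.reduce(a, axis=0)

# The same holds for products and the multiply ufunc.
via_prod = N.prod(a, axis=1)
via_mul_reduce = N.multiply.reduce(a, axis=1)
```

So any speed-up in the generic reduce loop would presumably benefit all of these at once.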