[Numpy-discussion] Is sum() slow?
Travis Oliphant
oliphant at ee.byu.edu
Tue Mar 7 10:38:03 CST 2006
Mads Ipsen wrote:
>Here are some timings that puzzle me a bit. Sum the two rows of a 2xn
>matrix, where n is some large number
>python -m timeit -s "from numpy import array,sum,reshape; x =
>array([1.5]*1000); x = reshape(x,(2,500))" "x.sum(0)"
>10000 loops, best of 3: 36.2 usec per loop
>python -m timeit -s "from numpy import array,sum,reshape; x =
>array([1.5]*1000); x = reshape(x,(2,500))" "x[0] + x[1]"
>100000 loops, best of 3: 5.35 usec per loop
This is probably reasonable. There is overhead in the looping construct:
for each output element, the first input element is copied into the
output and then a function is called, which in this case runs a loop of
size 1 to compute the sum.
This is repeated 500 times, so you have 500 C function-pointer calls in
the first case.
In the second case you have basically a single call to the same function
where the 500-element loop is done.
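To make the difference in call counts concrete, here is a rough pure-Python model of the two patterns described above. It is only a sketch: inner_add is a hypothetical stand-in for the C inner loop, not the actual ufunc machinery, and reduce_style assumes exactly two rows as in the example being discussed.

```python
import numpy as np

def inner_add(a, b, out):
    """Hypothetical stand-in for a C inner loop: one call adds len(a) elements."""
    for i in range(len(a)):
        out[i] = a[i] + b[i]

def reduce_style(x):
    """Model of the x.sum(0) pattern: one inner-loop call per output element."""
    assert x.shape[0] == 2  # assumption: two rows, as in the timing example
    out = np.empty(x.shape[1], dtype=x.dtype)
    calls = 0
    for j in range(x.shape[1]):
        out[j] = x[0, j]  # copy the first element into the output
        # call the inner loop with a loop of size 1
        inner_add(out[j:j + 1], x[1, j:j + 1], out[j:j + 1])
        calls += 1
    return out, calls

def elementwise_style(x):
    """Model of x[0] + x[1]: a single inner-loop call over all elements."""
    out = np.empty(x.shape[1], dtype=x.dtype)
    inner_add(x[0], x[1], out)
    return out, 1

x = np.full((2, 500), 1.5)
r_out, r_calls = reduce_style(x)
e_out, e_calls = elementwise_style(x)
print(r_calls, e_calls)  # prints "500 1"
```

Both functions produce the same result; only the number of calls into the inner loop differs, which is where the per-call overhead accumulates.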
I'm a little surprised that Numeric is so much faster for this case as
you show later. The sum code is actually add.reduce, which uses a
generic reduction concept in ufuncs. That generality carries overhead
compared with what you might do using a less general approach.
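For anyone who wants to check this on their own machine, the following sketch compares the three spellings on the same 2x500 array. It assumes a reasonably recent numpy and Python 3 (the "np" alias and timeit's globals argument are not from the original post), and the timings it prints will vary by version and hardware, so they should not be expected to match the numbers quoted above.

```python
import timeit
import numpy as np

# Same data as in the quoted timing commands: a 2x500 array of 1.5
x = np.array([1.5] * 1000).reshape(2, 500)

via_sum = x.sum(0)                      # the method being timed
via_reduce = np.add.reduce(x, axis=0)   # the ufunc reduction that sum uses
via_add = x[0] + x[1]                   # a single elementwise add

# All three give the same values
assert np.array_equal(via_sum, via_reduce)
assert np.array_equal(via_sum, via_add)

n = 10000
for stmt in ("x.sum(0)", "np.add.reduce(x, axis=0)", "x[0] + x[1]"):
    seconds = timeit.timeit(stmt, globals={"x": x, "np": np}, number=n)
    print(f"{stmt:28s} {seconds / n * 1e6:.2f} usec per loop")
```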
If anyone can figure out how to make the NOBUFFER section in
GenericReduce in ufuncobject.c faster, it will be greatly welcomed.
Speed improvements are always welcome.
-Travis
