[Numpy-discussion] efficient summation

Stephen Walton stephen.walton at csun.edu
Wed Sep 1 16:45:10 CDT 2004


On Wed, 2004-09-01 at 14:51, Darren Dale wrote:
> I am trying to efficiently sum over a subset of the elements of a 
> matrix. In Matlab, this could be done like:
> a=[1,2,3,4,5,6,7,8,9,10]
> b = [1,0,0,0,0,0,0,0,0,1]
> res=sum(a(b))

This needs to be sum(a(find(b))).
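
For comparison, here is a rough sketch of the same selection in Python. It assumes present-day NumPy (the successor to numarray and Numeric), not the numarray API under discussion in this thread, so treat it as an illustration only. It shows a boolean mask, an index form analogous to find(), and what happens if the 0/1 integer array is used directly as an index.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
b = np.array([1, 0, 0, 0, 0, 0, 0, 0, 0, 1])

# Boolean mask: keep the elements of a where b is nonzero.
res_mask = a[b.astype(bool)].sum()

# Index form, analogous to MATLAB's a(find(b)).
res_find = a[np.nonzero(b)[0]].sum()

# Pitfall: using the integer array b directly treats its values as
# positions, so this sums a[1], eight copies of a[0], and a[1] again.
res_wrong = a[b].sum()

print(res_mask, res_find, res_wrong)
```

The first two forms sum the first and last elements; the third silently computes something else, which is the NumPy counterpart of the a(b) versus a(find(b)) distinction.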


> Is there anything similar in numarray (or numeric)? I thought masked 
> arrays looked promising, but I find that masking 90% of the elements 
> results in marginal speedups (~5%, instead of 90%) over the unmasked
> array.

I don't think that's bad, and in fact it is substantially better than
MATLAB.  Consider the following clip from MATLAB Version 7:

>> a=randn(10000000,1);
>> t=cputime;sum(a);e=cputime()-t

e =

    0.1300

>> f=rand(10000000,1)<0.1;
>> t=cputime;sum(a(find(f)));e=cputime()-t

e =

    0.2200

In other words, masking off all but 10% of the elements of a 1e7 element
array actually increased the CPU time required for the sum by about 70%
(from 0.13 s to 0.22 s).

In addition, I doubt you can measure CPU time for only a 10 element
array.  I had to use 1e7 elements in MATLAB on a 2.26GHz P4 just to get
the CPU time large enough to measure reasonably accurately.  Also recall
that it is a known characteristic of numarray that it is slow on small
arrays in general.
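
A comparable measurement can be sketched in Python. This assumes present-day NumPy rather than numarray, and the helper name cpu_time is made up for illustration. The array is 1e6 elements instead of 1e7 to keep it quick, and the best of several runs is taken because a single CPU-time reading on a short operation is noisy. The actual numbers depend entirely on the machine and library version, so none are shown.

```python
import time
import numpy as np

def cpu_time(fn, repeats=5):
    """Return the smallest CPU time (seconds) over several calls to fn.

    Hypothetical helper for this sketch, not part of any library.
    """
    best = float("inf")
    for _ in range(repeats):
        start = time.process_time()
        fn()
        best = min(best, time.process_time() - start)
    return best

rng = np.random.default_rng(0)
n = 1_000_000                  # smaller than the 1e7 used in the MATLAB session
a = rng.standard_normal(n)
f = rng.random(n) < 0.1        # boolean mask keeping roughly 10% of the elements

t_full = cpu_time(lambda: a.sum())
t_masked = cpu_time(lambda: a[f].sum())

print("full sum:   %.4f s" % t_full)
print("masked sum: %.4f s" % t_masked)
```

The masked sum has to build the selected subarray before summing it, so it is not expected to cost only 10% of the full sum.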

-- 
Stephen Walton <stephen.walton at csun.edu>
Dept. of Physics & Astronomy, Cal State Northridge