can this be made faster?

Robert Kern robert.kern at
Mon Oct 9 00:40:41 CDT 2006

Daniel Mahler wrote:
> On 10/8/06, Greg Willden <gregwillden at> wrote:

>> This next one is a little closer for the case when c is not just a bunch of
>> 1's, but you still have to know the highest number in b.
>> a=array([sum(c[b==0]),  sum(c[b==1]), ... sum(c[b==N]) ] )
>> So it sort of depends on your ultimate goal.
>> Greg
>> Linux.  Because rebooting is for adding hardware.
> In my case all a, b, c are large with b and c being orders of
> magnitude larger than a.
> b is known to contain only, but potentially any, a-indexes, repeated
> many times.
> c contains arbitrary floats.
> essentially it is to compute class totals
> as in total[class[i]] += value[i]

In that case, a slight modification to Greg's suggestion will probably be fastest:

import numpy as np

# Set up the problem.
lena = 10
lenc = 10000
a = np.zeros(lena, dtype=float)
b = np.random.randint(lena, size=lenc)
c = np.random.uniform(size=lenc)

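# Row i of mask is True wherever b == i (a column of indices broadcast against b).
idx = np.arange(lena, dtype=int)[:, np.newaxis]
mask = (b == idx)
for i in range(lena):
    # Total the values of c whose class index is i.
    a[i] = c[mask[i]].sum()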
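
On current NumPy releases the same class totals can probably be computed without the Python loop by passing c as weights to np.bincount. The following is only a sketch, assuming b holds non-negative integers smaller than lena; the names rng and expected are illustrative and not part of the code above, and the relative speed has not been measured here.

```python
import numpy as np

# Illustrative sizes, mirroring the setup above.
lena = 10
lenc = 10000
rng = np.random.default_rng(0)
b = rng.integers(lena, size=lenc)  # class index of each value
c = rng.uniform(size=lenc)         # values to be totalled

# Sum the entries of c belonging to each class in b in a single pass.
# minlength keeps the result at lena entries even if the highest
# class indices never occur in b.
a = np.bincount(b, weights=c, minlength=lena)

# Reference result from an explicit per-class mask, for comparison.
expected = np.array([c[b == i].sum() for i in range(lena)])
assert np.allclose(a, expected)
```

Unlike the mask approach, this does not build a lena-by-lenc boolean array, which may matter when b and c are much larger than a.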

Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
  that is made terrible by our own mad attempt to interpret it as though it had
  an underlying truth."
   -- Umberto Eco

More information about the Numpy-discussion mailing list