[Numpy-discussion] Fancier indexing

Keith Goodman kwgoodman@gmail....
Thu May 22 11:34:46 CDT 2008


On Thu, May 22, 2008 at 9:22 AM, Robin <robince@gmail.com> wrote:
> On Thu, May 22, 2008 at 4:59 PM, Kevin Jacobs <jacobs@bioinformed.com>
> <bioinformed@gmail.com> wrote:
>> After poking around for a bit, I was wondering if there was a faster method
>> for the following:
>>
>> # Array of index values 0..n
>> items = numpy.array([0,3,2,1,4,2],dtype=int)
>>
>> # Count the number of occurrences of each index
>> counts = numpy.zeros(5, dtype=int)
>> for i in items:
>>   counts[i] += 1
>>
>> In my real code, 'items' contain up to a million values and this loop will
>> be in a performance critical area of code.  If there is no simple solution,
>> I can trivially code this using the C-API.
>
> I would use bincount:
> count = bincount(items)
> should be all you need.

I guess bincount is a *little* faster:

>> items = np.random.randint(0, 100, (1000000,))
>> timeit np.bincount(items)
100 loops, best of 3: 4.05 ms per loop
>> items = items.tolist()
>> timeit [items.count(i) for i in range(100)]
10 loops, best of 3: 2.91 s per loop
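
For anyone who wants to check that bincount really gives the same answer as the original loop, here is a minimal sketch using the six-element example from the question. It assumes a NumPy recent enough to accept the minlength argument, which pads the result to a fixed number of bins; the names loop_counts and bin_counts are just for illustration.

```python
import numpy as np

# Example index array from the quoted question
items = np.array([0, 3, 2, 1, 4, 2], dtype=int)

# Original approach: explicit Python loop over the indices
loop_counts = np.zeros(5, dtype=int)
for i in items:
    loop_counts[i] += 1

# Vectorized approach: bincount counts occurrences of each non-negative integer.
# minlength=5 (assumed available in your NumPy) guarantees at least 5 bins,
# even if the largest index does not appear in items.
bin_counts = np.bincount(items, minlength=5)

print(loop_counts)
print(bin_counts)
```

Both arrays should come out identical. Note that bincount only accepts non-negative integers, so it fits the "array of index values" case here but not arbitrary data.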

