# [Numpy-discussion] Fast histogram

Zachary Pincus zachary.pincus@yale....
Thu Apr 17 13:11:21 CDT 2008

```
Hello,

>> But even if indices = array, one still needs to do something like:
>> for index in indices: histogram[index] += 1
> numpy.bincount?

That is indeed what I was looking for! I knew I'd seen such a function.

However, the speed is a bit disappointing. I guess the sorting in
numpy.histogram isn't too much of a penalty:

import numpy
def histogram(array, bins, range):
    min, max = range
    # scale each value to a bin index, then clamp to [0, bins-1]
    scaled = (array.astype(float) - min) * bins / (max - min)
    indices = numpy.clip(scaled.astype(int), 0, bins-1).flat
    return numpy.bincount(indices)

import numexpr
def histogram_numexpr(array, bins, range):
    min, max = range
    min = float(min)
    max = float(max)
    indices = numexpr.evaluate('(array - min) * bins / (max - min)')
    indices = numpy.clip(indices.astype(int), 0, bins-1).flat
    return numpy.bincount(indices)

>>> arr.shape
(1300, 1030)

>>> timeit numpy.histogram(arr, 12, [0, 5000])
10 loops, best of 3: 99.9 ms per loop

>>> timeit histogram(arr, 12, [0, 5000])
10 loops, best of 3: 127 ms per loop

>>> timeit histogram_numexpr(arr, 12, [0, 5000])
10 loops, best of 3: 109 ms per loop

>>> timeit numpy.histogram(arr, 5000, [0, 5000])
10 loops, best of 3: 111 ms per loop

>>> timeit histogram(arr, 5000, [0, 5000])
10 loops, best of 3: 127 ms per loop

>>> timeit histogram_numexpr(arr, 5000, [0, 5000])
10 loops, best of 3: 108 ms per loop

So, they're all quite close: numpy.histogram wins at 12 bins and is
within a few ms of the numexpr version at 5000 bins, so bincount buys
nothing here. Huh. I guess I will have to go to C or maybe weave to
get up to video-rate, unless folks can suggest some further
optimizations...

Zach
```
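
For anyone trying the bincount approach from this message, here is a minimal sketch of it, checked against numpy.histogram. It is not the code that was timed above. The helper name `bincount_histogram` and the random test data are made up for illustration. It also assumes a NumPy recent enough to have the `minlength` argument to `numpy.bincount`, which keeps empty top bins from being dropped from the result.

```python
import numpy as np

def bincount_histogram(values, bins, value_range):
    """Equal-width histogram via np.bincount (hypothetical helper name)."""
    lo, hi = map(float, value_range)
    # Map each value to a bin index. This is the vectorized form of
    # "for index in indices: histogram[index] += 1".
    scaled = (np.asarray(values, dtype=float).ravel() - lo) * bins / (hi - lo)
    # Clipping folds values equal to hi into the last bin. It also pushes
    # out-of-range values into the edge bins, which np.histogram would drop.
    indices = np.clip(scaled.astype(np.intp), 0, bins - 1)
    # minlength pads the result to exactly `bins` entries.
    return np.bincount(indices, minlength=bins)

# Made-up integer test data inside the range, not the image from the post.
rng = np.random.default_rng(0)
arr = rng.integers(0, 5001, size=(130, 103))

counts = bincount_histogram(arr, 12, (0, 5000))
reference, _ = np.histogram(arr, bins=12, range=(0, 5000))
```

For in-range integer data like this the two results should agree bin for bin. The sketch says nothing about speed, which depends on the NumPy version and the data.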