[Numpy-discussion] Fast histogram

Zachary Pincus zachary.pincus@yale....
Thu Apr 17 11:46:27 CDT 2008


Hi,

> How about a combination of sort, followed by searchsorted right/left  
> using the bin boundaries as keys? The difference of the two  
> resulting vectors is the bin value. Something like:
>
> In [1]: data = arange(100)
>
> In [2]: bins = [0,10,50,70,100]
>
> In [3]: lind = data.searchsorted(bins)
>
> In [4]: print(lind[1:] - lind[:-1])
> [10 40 20 30]
>
> This won't be as fast as a c implementation, but at least avoids the  
> loop.

This is, more or less, what the current numpy.histogram does, no? I  
was hoping to avoid the O(n log n) sorting, because the image arrays  
are pretty big, and numpy.histogram doesn't get close to video rate  
for me...
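
For integer-valued image data, one way to skip the sort entirely is to count each pixel value once and then sum those counts within each bin. The following is only a minimal sketch, assuming 8-bit pixels and integer bin edges; the function names and the nvalues parameter are hypothetical, chosen for illustration, and are not part of numpy.histogram. The sort-based version is included only for comparison.

```python
import numpy as np

def sorted_histogram(data, edges):
    """Sort, then locate each bin edge with searchsorted: O(n log n)."""
    s = np.sort(data, axis=None)      # flattened, sorted copy
    idx = s.searchsorted(edges)       # number of elements below each edge
    return idx[1:] - idx[:-1]         # counts in [edges[i], edges[i+1])

def bincount_histogram(data, edges, nvalues=256):
    """Count every pixel value once, then sum per bin: O(n + nvalues).

    Assumes non-negative integer data and integer edges in 0..nvalues.
    """
    counts = np.bincount(np.ravel(data), minlength=nvalues)
    # cum[v] is the number of pixels with value strictly less than v
    cum = np.concatenate(([0], np.cumsum(counts)))
    edges = np.asarray(edges)
    return cum[edges[1:]] - cum[edges[:-1]]

# Usage on a synthetic 8-bit frame (placeholder data, not a real image)
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(480, 640), dtype=np.uint8)
edges = [0, 10, 50, 70, 100, 256]
print(bincount_histogram(image, edges))
```

Both functions use half-open bins [edges[i], edges[i+1]), so they should agree with each other on integer data; the second one never sorts the pixels.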

Perhaps, though, some of the slowdown in numpy.histogram comes from
other overhead rather than the sorting. I'll try this, but I think I'll
probably just have to write the C loop...
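
One way to check where the time goes is to time the sort on its own against the whole numpy.histogram call. This is a rough sketch, assuming a 480x640 8-bit frame stands in for the real images; the frame size, bin edges, repeat count, and the best_ms helper are all placeholders, and the numbers depend on the machine and the numpy version.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)
frame = rng.integers(0, 256, size=(480, 640), dtype=np.uint8)  # placeholder image
edges = np.arange(0, 257, 16)                                   # 16 equal-width bins

def best_ms(func, repeat=5):
    """Best-of-`repeat` wall time for a single call, in milliseconds."""
    return 1000 * min(timeit.repeat(func, number=1, repeat=repeat))

timings = {
    "np.histogram": best_ms(lambda: np.histogram(frame, bins=edges)),
    "sort only": best_ms(lambda: np.sort(frame, axis=None)),
    "np.bincount": best_ms(lambda: np.bincount(frame.ravel(), minlength=256)),
}

for name, ms in timings.items():
    print(f"{name:>14s}: {ms:8.3f} ms")
```

If "sort only" accounts for most of the np.histogram time, the sort is the bottleneck; if it is much smaller, the cost lies in other overhead. The np.bincount line gives a rough idea of what a single linear pass over the pixels costs.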

Zach

