[Numpy-discussion] Fast histogram
Zachary Pincus
zachary.pincus@yale....
Thu Apr 17 11:46:27 CDT 2008
Hi,
> How about a combination of sort, followed by searchsorted right/left
> using the bin boundaries as keys? The difference of the two
> resulting vectors is the bin value. Something like:
>
> In [1]: data = arange(100)
>
> In [2]: bins = [0,10,50,70,100]
>
> In [3]: lind = data.searchsorted(bins)
>
> In [4]: print(lind[1:] - lind[:-1])
> [10 40 20 30]
>
> This won't be as fast as a C implementation, but at least avoids the
> loop.
This is, more or less, what the current numpy.histogram does, no? I
was hoping to avoid the O(n log n) sorting, because the image arrays
are pretty big, and numpy.histogram doesn't get close to video rate
for me...
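
If the pixel values happen to be small non-negative integers (an assumption here, e.g. 8-bit images; the post does not state the dtype), one way to skip the sort is a single linear counting pass with np.bincount. A minimal sketch, where `uint8_histogram` and `frame` are hypothetical names used only for illustration:

```python
import numpy as np

def uint8_histogram(image):
    """Count how often each value 0..255 occurs, in one pass and without sorting."""
    # ravel() flattens the image; it returns a view when the array is contiguous
    flat = np.asarray(image).ravel()
    # minlength=256 pads the result so values that never occur still get a zero bin
    return np.bincount(flat, minlength=256)

# Hypothetical frame standing in for one video image (assumed 8-bit)
frame = np.random.randint(0, 256, size=(480, 640)).astype(np.uint8)
counts = uint8_histogram(frame)
```

This only covers one bin per integer value; floating-point data or arbitrary bin edges would need a different approach.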
Perhaps, though, some of the slow-down from numpy.histogram is from
other overhead, and not the sorting. I'll try this, but I think I'll
probably just have to write the C loop...
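
A rough way to check how much of the time is sorting and how much is other overhead is to time a bare np.sort against np.histogram on the same data. This is only a sketch: the array size and the 256 bins are placeholder assumptions, `best_time` is a hypothetical helper, and nothing is assumed about what a given NumPy version does internally.

```python
import timeit
import numpy as np

# Hypothetical test data: size and bin count are placeholders
data = np.random.randint(0, 256, size=640 * 480).astype(np.float64)
edges = np.linspace(0, 256, 257)

def best_time(func, repeat=5):
    """Best wall-clock time of a single call, in seconds."""
    return min(timeit.repeat(func, number=1, repeat=repeat))

# Time the sort alone, then the full histogram call, on the same array
t_sort = best_time(lambda: np.sort(data))
t_hist = best_time(lambda: np.histogram(data, bins=edges))

print("sort only:    %.4f s" % t_sort)
print("np.histogram: %.4f s" % t_hist)
```

If the two numbers are close, the sort dominates; a large gap would point at overhead elsewhere.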
Zach