[Numpy-discussion] Fast histogram
Sebastian Haase
haase@msg.ucsf....
Thu Apr 17 11:38:22 CDT 2008
On Thu, Apr 17, 2008 at 6:18 PM, Charles R Harris
<charlesr.harris@gmail.com> wrote:
>
>
> On Thu, Apr 17, 2008 at 10:02 AM, Zachary Pincus <zachary.pincus@yale.edu>
> wrote:
> > Hi folks,
> >
> > I'm working on a live-video display for some microscope control tools
> > I'm building. For this, I need a fast histogram function to work on
> > large-ish images (1000x2000 or so) at video rate, with cycles left
> > over for more interesting calculations (like autofocus).
> >
> > Now, numpy.histogram is a bit slower than I'd like, probably because
> > it's pretty general (and of course cf. the recent discussion about its
> > speed). I just need even bins within a set range. This is easy enough
> > to do with a C-extension, or perhaps even cython, but before I go
> > there, I was wondering if there's a numpy function that can help.
> >
> > Here's what I have in mind:
> >
> > def histogram(arr, bins, range):
> > min, max = range
> > indices = numpy.clip(((arr.astype(float) - min) * bins / (max -
> > min)).astype(int), 0, bins-1)
> > histogram = numpy.zeros(bins, numpy.uint32)
> > for i in indices:
> > histogram[i] += 1
> >
> > Now, clearly, the last loop is what needs speeding up. Are there any
> > numpy functions that can do this kind of operation? Also perhaps
> > unnecessarily slow is the conversion of 'arr' to a float -- I do this
> > to avoid overflow issues with integer arrays.
> >
>
> How about a combination of sort, followed by searchsorted right/left using
> the bin boundaries as keys? The difference of the two resulting vectors is
> the bin value. Something like:
>
> In [1]: data = arange(100)
>
> In [2]: bins = [0,10,50,70,100]
>
> In [3]: lind = data.searchsorted(bins)
>
> In [4]: print lind[1:] - lind[:-1]
> [10 40 20 30]
>
> This won't be as fast as a c implementation, but at least avoids the loop.
>
> Chuck
How many bits per pixel does your camera actually generate !?
If its for example a 12 bit camera, you could just fill in directly
into 4096 preallocated bins.
You would not need any sorting !!
That's what I did for a 16 bit camera -- but I wrote it in C and I had
4 cameras at 30 Hz.
-Sebastian Haase
More information about the Numpy-discussion
mailing list