[Numpy-discussion] Histograms via indirect index arrays

Norbert Nemec Norbert.Nemec.list at gmx.de
Thu Mar 16 08:01:02 CST 2006

I have a closely related problem: not only does the idea described by
Mads Ipsen not work, but I could find no efficient way at all to count
the elements of an array, as is needed for a histogram.

The function "histogram" contained in numpy uses a rather inefficient
method, involving the sorting of the data array.

What would be needed instead is a function that simply returns the number
of occurrences of given values in a given array:

>>> [4,5,2,3,2,1,4].count([0,1,2,3,4,5])
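
For comparison, here is a minimal sketch of the counting operation described
above, written against present-day NumPy. It assumes the data are
non-negative integers and relies on np.bincount, which this message does not
mention; the variable names are illustrative only.

```python
import numpy as np

# Same data as in the wished-for call above.
data = np.array([4, 5, 2, 3, 2, 1, 4])

# bincount tallies how often each non-negative integer occurs;
# minlength=6 guarantees one slot for each of the values 0..5.
counts = np.bincount(data, minlength=6)

print(counts)  # [0 1 2 1 2 1]
```

Slot k of the result holds the number of times k occurs in the data, so the
values [0,1,2,3,4,5] map to the counts [0,1,2,1,2,1].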

All the solutions that I have found so far involve either sorting the data
or writing a loop in Python, both of which are unacceptably slow.

Am I missing something obvious?

Mads Ipsen wrote:

>First of all, thanks for the new release.
>Here's another question regarding something I cannot quite understand:
>Suppose you want to update bins for a histogram, you might think you
>could do something like:
>  g = zeros(4,Int)
>  x = array([0.2, 0.2])
>  idx = floor(x/0.1).astype(int)
>  g[idx] += 1
>Here idx becomes
>   array([2, 2])
>In this case, I would naively expect g to end up like
>  array([0, 0, 2, 0])                     (1)
>but instead you get
>  array([0, 0, 1, 0])                     (2)
>Is this intended? Just being plain novice-like naive, I would expect
>the slice operation g[idx] += 1 to do something like
>  for i in range(len(idx)):
>    g[ idx[i] ] += 1
>resulting in (1) and not (2).
>// Mads
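
A note on the behaviour quoted above: g[idx] += 1 is evaluated as
g[idx] = g[idx] + 1, so the right-hand side is computed once and then
assigned, and a repeated index is simply written twice with the same value.
The sketch below contrasts this with np.add.at, an unbuffered variant
available in present-day NumPy. It is not part of the original exchange, and
the names "buffered" and "accumulated" are illustrative.

```python
import numpy as np

# Index array from the quoted example: both samples fall into bin 2.
idx = np.array([2, 2])

# Fancy-index in-place add: the repeated index is only counted once.
buffered = np.zeros(4, dtype=int)
buffered[idx] += 1
print(buffered)  # [0 0 1 0]

# Unbuffered add: every occurrence of an index contributes.
accumulated = np.zeros(4, dtype=int)
np.add.at(accumulated, idx, 1)
print(accumulated)  # [0 0 2 0]
```

The second result matches what the explicit Python loop in the quoted
message would produce.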
