[Numpy-discussion] fast method to count a particular value in a large matrix

Warren Weckesser warren.weckesser@enthought....
Sat Feb 4 15:04:51 CST 2012


On Sat, Feb 4, 2012 at 2:35 PM, Benjamin Root <ben.root@ou.edu> wrote:

>
>
> On Saturday, February 4, 2012, Naresh Pai <npai@uark.edu> wrote:
> > I am somewhat new to Python (been coding with Matlab mostly). I am
> > trying to simplify (and expedite) a piece of code that is currently a
> > bottleneck in a larger code.
> > I have a large array (7000 rows x 4500 columns) titled say, abc, and I
> > am trying to find a fast method to count the number of instances of each
> > unique value within it. All unique values are stored in a variable, say,
> > unique_elem. My current code is as follows:
> > import numpy as np
> > #allocate space for storing element count
> > elem_count = np.zeros((len(unique_elem),1))
> > #loop through and count number of unique_elem
> > for i in range(len(unique_elem)):
> >    elem_count[i]= np.sum(reduce(np.logical_or,(abc== x for x in [unique_elem[i]])))
> > This loop is a bottleneck because I have about 850 unique elements and it
> > takes about 9-10 minutes. Can you suggest a faster way to do this?
> > Thank you,
> > Naresh
> >
>
> np.unique() can return indices and reverse indices.  It would be trivial
> to histogram the reverse indices using np.histogram().
>
>

Instead of histogram(), you can use bincount() on the inverse indices:

u, inv = np.unique(abc, return_inverse=True)
n = np.bincount(inv)


u will be an array of the unique elements, and n will be an array of the
corresponding number of occurrences.
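
As a minimal sketch of the two lines above, here is a self-contained version
run on a tiny made-up array standing in for abc (the values are illustrative,
not from the original data). The ravel() call is a precaution I am adding:
depending on the NumPy version, inv may come back flat or with the shape of
abc, and bincount() needs a 1-D input.

```python
import numpy as np

# Tiny stand-in for the 7000 x 4500 array from the question (made-up values).
abc = np.array([[3, 1, 3],
                [7, 1, 3]])

# u holds the sorted unique values; inv maps each element of abc to its
# position in u.
u, inv = np.unique(abc, return_inverse=True)

# Count how often each position in u occurs. ravel() guards against inv
# being returned with the shape of abc rather than flat.
n = np.bincount(inv.ravel())

print(u)  # [1 3 7]
print(n)  # [2 3 1]

# Cross-check against the direct comparison used in the original loop.
for value, count in zip(u, n):
    assert count == np.sum(abc == value)
```

This replaces the roughly 850 full-array comparisons with one sort-based pass
over the data plus one counting pass. If your NumPy is recent enough to
accept it, np.unique(abc, return_counts=True) should return the same counts
directly.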

Warren