[Numpy-discussion] counting non-zero entries in an ndarray

Jonathan Rocher jrocher@enthought....
Wed Dec 22 14:29:54 CST 2010


To answer the part about the most efficient way to do that, here are some timings:

In [1]: a = array([0,1,4,76,3,0,4,67,9,5,3,9,0,5,23,3,0,5,3,3,0,5,0])

In [8]: %timeit len(where(a!=0)[0])
100000 loops, best of 3: 6.54 us per loop

In [9]: %timeit (a!=0).sum()
100000 loops, best of 3: 9.81 us per loop

For this small array, the where option seems to be faster.

Now I create a larger array:
In [13]: a = hstack([a,a,a,a,a,a,a,a,a,a,a,a])

In [14]: %timeit len(where(a!=0)[0])
100000 loops, best of 3: 12.3 us per loop

In [15]: %timeit (a!=0).sum()
100000 loops, best of 3: 11 us per loop

Now the fastest way is to use sum. The where call has more work to do: it
must allocate and fill an array holding the index of every non-zero element,
only for len() to discard it, and that cost grows with the array. The sum
only reduces the boolean mask to a single number. And the difference
increases fast...

In [20]: a = hstack([a,a,a,a,a,a,a,a,a,a,a,a])

In [21]: %timeit len(where(a!=0)[0])
10000 loops, best of 3: 79.1 us per loop

In [22]: %timeit (a!=0).sum()
10000 loops, best of 3: 24.5 us per loop
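
For anyone who wants to rerun this outside IPython, here is a minimal sketch of the two approaches as plain functions, using the same sample array. The function names are made up for illustration. The third variant uses np.count_nonzero, which is an assumption on my part: it is not used anywhere in this thread and requires NumPy 1.6 or later.

```python
import numpy as np

# Same sample data as in the session above.
a = np.array([0, 1, 4, 76, 3, 0, 4, 67, 9, 5, 3, 9, 0, 5, 23, 3, 0, 5, 3, 3, 0, 5, 0])


def nnz_where(arr):
    # where() with a single argument returns a tuple of index arrays,
    # each holding one entry per non-zero element, so its length is the count.
    return len(np.where(arr != 0)[0])


def nnz_sum(arr):
    # True counts as 1 when a boolean mask is summed.
    return int((arr != 0).sum())


def nnz_count(arr):
    # Assumption: NumPy >= 1.6, where count_nonzero is available.
    return int(np.count_nonzero(arr))


# Same construction as In [13] above: twelve copies of the sample array.
big = np.hstack([a] * 12)

print(nnz_where(a), nnz_sum(a), nnz_count(a))
print(nnz_where(big), nnz_sum(big), nnz_count(big))
```

All three return the same count; only the amount of intermediate work differs. The timeit module from the standard library can be used to repeat the comparison on your own machine, where the absolute numbers will differ from the ones above.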

Regards,
Jonathan

On Wed, Dec 22, 2010 at 11:43 AM, Thomas K Gamble <tkg@lanl.gov> wrote:

> On Wednesday, December 22, 2010 07:16:17 am Ian Stokes-Rees wrote:
> > What is the most efficient way to do the Matlab equivalent of nnz(M)
> > (nnz = number-of-non-zeros function)?
> >
> > I've tried Google, but no luck.
> >
> > My assumption is that something like
> >
> > a != 0
> >
> > will be used, but I'm not sure then how to "count" the number of "True"
> > entries.
> >
> > TIA.
> >
> > Ian
>
> one possibility:
>
> len(where(a != 0)[0])
>
> --
> Thomas K. Gamble
> Research Technologist, System/Network Administrator
> Chemical Diagnostics and Engineering (C-CDE)
> Los Alamos National Laboratory
> MS-E543,p:505-665-4323 f:505-665-4267
>
> There cannot be a crisis next week. My schedule is already full.
>    Henry Kissinger



-- 
Jonathan Rocher,
Enthought, Inc.
jrocher@enthought.com
1-512-536-1057
http://www.enthought.com