[Numpy-discussion] Translation of Matlab function find()

Francesc Altet faltet@carabos....
Wed Mar 21 14:34:13 CDT 2007


On Wed, 21 Mar 2007 at 09:23 +0100, Miquel Poch wrote:
> Hi,
> 
> I'm trying to translate some Matlab functions. I don't know exactly
> how to make an equivalent of the function find(). It gives us the
> indices of an array where a condition is true.
> 
> I've found some options like this one: (a>0).nonzero(), where a is the
> array and (a>0) the condition. But the problem appears with what this
> function returns. It's something like this:
> 
> >>> from numpy import *
> >>> a=array([0,1,0,1,2,0,1,2,0,0])
> >>> print a
> [0 1 0 1 2 0 1 2 0 0]
> >>> b=(a>0).nonzero()
> >>> print b
> (array([1, 3, 4, 6, 7]),) 
> >>> print (a>0).nonzero()
> (array([1, 3, 4, 6, 7]),)
> 
> I can't access b's data by doing something like b[x]. Why is b not an
> array?

The result is a tuple for the sake of generality: it holds one index
array per dimension. Think about this:

>>> import numpy
>>> b=numpy.array([0,1,0,1,2,0,1,2,0,0]).reshape(2,5)
>>> numpy.where(b>0)
(array([0, 0, 0, 1, 1]), array([1, 3, 4, 1, 2]))

The result holds the coordinates of the elements that are greater than
zero: the first array contains the first-dimension (row) coordinates and
the second array the second-dimension (column) ones. If you want to group
the indices by element, rather than by dimension, use:

>>> numpy.transpose(numpy.where(b>0))
array([[0, 1],
       [0, 3],
       [0, 4],
       [1, 1],
       [1, 2]])
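
As a small illustration of the two layouts (a sketch only, not part of
the session above): the tuple returned by where() can be passed straight
back as an index to pull out the matching values, and numpy.argwhere()
gives the same per-element grouping as the transpose. The names coords,
values and pairs are just chosen for this example.

```python
import numpy

b = numpy.array([0, 1, 0, 1, 2, 0, 1, 2, 0, 0]).reshape(2, 5)

# One index array per dimension: (row indices, column indices).
coords = numpy.where(b > 0)

# The tuple can be used directly as an index to fetch the matching values.
values = b[coords]

# argwhere() groups the indices by element, one [row, col] pair per match.
pairs = numpy.argwhere(b > 0)

print(values)
print(pairs)
```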

Extrapolating this to a 1-dimensional array a, we have:

>>> a=numpy.array([0,1,0,1,2,0,1,2,0,0])
>>> numpy.where(a>0)
(array([1, 3, 4, 6, 7]),)

If the one-element tuple is not what you want, just take its first item:
>>> c = numpy.where(a>0)[0]
>>> c
array([1, 3, 4, 6, 7])

and you are done.
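
If you want something that reads like Matlab's find(), one possible
sketch is a tiny wrapper. The name find is hypothetical (numpy does not
provide it under that name); it simply relies on numpy.flatnonzero(),
which returns a plain index array instead of a tuple.

```python
import numpy

def find(condition):
    """Hypothetical helper: 0-based indices where condition is true."""
    # flatnonzero() looks at the flattened (row-major) array and
    # returns a single index array rather than a tuple of arrays.
    return numpy.flatnonzero(condition)

a = numpy.array([0, 1, 0, 1, 2, 0, 1, 2, 0, 0])
idx = find(a > 0)

print(idx)     # positions of the elements greater than zero
print(a[idx])  # the elements themselves
```

Keep in mind that the indices are 0-based, whereas Matlab's are 1-based,
and that for arrays with more than one dimension Matlab numbers the
elements in column-major order, so the flat indices will not match
directly.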

> I think another possibility is to use the where() function, also
> available in the numpy library. Is it better or worse?

Well, where() does more things than .nonzero() (calling where() with
only a condition is equivalent to .nonzero()); see help(numpy.where) for
more info. However, they perform similarly:

>>> from timeit import Timer
>>> t1=Timer('numpy.where(a>0)', 'import numpy; a=numpy.arange(10)')
>>> t2=Timer('(a>0).nonzero()', 'import numpy; a=numpy.arange(10)')
>>> t1.repeat(3, 10000)
[0.27704501152038574, 0.23136401176452637, 0.23428702354431152]
>>> t2.repeat(3, 10000)
[0.26156210899353027, 0.21894097328186035, 0.21954011917114258]

So, use whichever you prefer.
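
To illustrate the "more things" that where() does, here is a short
sketch comparing the one-argument form with the three-argument form,
which picks values from one of two sources depending on the condition.
The fill value -1 and the variable names are arbitrary choices for the
example.

```python
import numpy

a = numpy.array([0, 1, 0, 1, 2, 0, 1, 2, 0, 0])

# One argument: the same tuple of indices as (a > 0).nonzero().
idx_where = numpy.where(a > 0)
idx_nonzero = (a > 0).nonzero()

# Three arguments: take the value from a where the condition holds,
# and use -1 everywhere else. The result has the same shape as a.
picked = numpy.where(a > 0, a, -1)

print(idx_where)
print(picked)
```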

Regards,

-- 
Francesc Altet    |  Be careful about using the following code --
Carabos Coop. V.  |  I've only proven that it works, 
www.carabos.com   |  I haven't tested it. -- Donald Knuth


