[Numpy-discussion] Docstring improvements for numpy.where?

Fernando Perez fperez.net@gmail....
Wed Sep 12 20:14:07 CDT 2007


Hi all,

A couple of times I've been confused by numpy.where(), and I think
part of it comes from the docstring.  Searching my gmail archive seems
to indicate I'm not the only one bitten by this.

Compare:

In [14]: pdoc numpy.where
Class Docstring:
    where(condition, | x, y)

    The result is shaped like condition and has elements of x and y where
    condition is respectively true or false.  If x or y are not given,
    then it is equivalent to condition.nonzero().

    To group the indices by element, rather than dimension, use

        transpose(where(condition, | x, y))

    instead. This always results in a 2d array, with a row of indices for
    each element that satisfies the condition.

with (b is just any array):

In [17]: pdoc b.nonzero
Class Docstring:
    a.nonzero() returns a tuple of arrays

    Returns a tuple of arrays, one for each dimension of a,
    containing the indices of the non-zero elements in that
    dimension.  The corresponding non-zero values can be obtained
    with
        a[a.nonzero()].

    To group the indices by element, rather than dimension, use
        transpose(a.nonzero())
    instead. The result of this is always a 2d array, with a row for
    each non-zero element.;
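
For reference, here is a minimal sketch of what the nonzero() docstring above describes, written against a current numpy imported as np. The 2x3 array and the variable names are made up for illustration and are not from the docstring.

```python
import numpy as np

# Small 2-d example array (illustrative values only).
a = np.array([[0, 3, 0],
              [4, 0, 5]])

# One index array per dimension: (row indices, column indices).
by_dim = a.nonzero()

# Indexing with that tuple gives back the non-zero values themselves.
values = a[by_dim]

# Transposing groups the indices by element: one (row, col) pair per row.
by_elem = np.transpose(by_dim)

print(by_dim)
print(values)
print(by_elem)
```

The tuple form is convenient for indexing back into the array, while the transposed form is easier to loop over one element at a time.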


The sentence "The result is shaped like condition" in the where()
docstring is misleading: when x and y are omitted, the behavior is
really that of nonzero(), and where() returns a tuple of index arrays,
not an array shaped like condition.  If this were more clearly
explained, along with a simple example for the usual case that seems
to trip everyone:

In [21]: a=arange(10)

In [22]: N.where(a>5)
Out[22]: (array([6, 7, 8, 9]),)

In [23]: N.where(a>5)[0]
Out[23]: array([6, 7, 8, 9])

I think we'd get a lot less confusion.
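
To make the contrast concrete, here is a rough sketch of the two calling forms side by side, assuming a current numpy imported as np. The fill value -1 and the variable names are arbitrary choices for illustration, not something the docstring prescribes.

```python
import numpy as np

a = np.arange(10)
cond = a > 5

# One-argument form: a tuple of index arrays, same as cond.nonzero().
idx = np.where(cond)
same_as_nonzero = all(
    np.array_equal(w, n) for w, n in zip(idx, cond.nonzero())
)

# For a 1-d input the tuple has a single entry, hence the trailing [0].
first_axis = idx[0]

# Three-argument form: an array shaped like cond, taking elements
# from a where cond is true and -1 (arbitrary filler) elsewhere.
picked = np.where(cond, a, -1)

print(type(idx), len(idx))
print(first_axis)
print(picked.shape, picked)
```

Only the three-argument call gives a result "shaped like condition"; the one-argument call gives indices, which is where the [0] in the session above comes from.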

Or am I missing something, or just being dense (quite likely)?

Cheers,

f

