[Numpy-discussion] min() of array containing NaN

Joe Harrington jh@physics.ucf....
Thu Aug 14 01:29:12 CDT 2008


> I'm doing nothing. Someone else must volunteer.

Fair enough.  Would the code be accepted if contributed?

> There is a
> reasonable design rule that if you have a boolean argument which you
> expect to only be passed literal Trues and Falses, you should instead
> just have two different functions.

Robert, can you list some reasons to favor this design rule?  

Here are some reasons to favor richly functional routines:

Users' code is more readable because subtle differences show up in
   args, not in function names
Easier learning for new users
Much briefer and more readable docs
Similar behavior across languages
Smaller number of functions in the core package (a recent list topic)
Many fewer routines to maintain, particularly if multiple switches exist
Availability of the NaN functionality in a method of ndarray

The last point is key.  The NaN behavior is central to analyzing real
data containing unavoidable bad values, which is the bread and butter
of a substantial fraction of the user base.  In the languages they're
switching from, handling NaNs is just part of doing business, and is
an option of every relevant routine; there's no need for redundant
sets of routines.  In contrast, numpy appears to consider data
analysis to be secondary, somehow, to pure math, and takes the NaN
functionality out of routines like min() and std().  This means many
ndarray methods can't be used on data that contains NaNs.  If we're
ready to handle a NaN by returning it, why not enable the more useful
behavior of ignoring it, at user discretion?
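
To make the contrast concrete, here is a minimal sketch, assuming a
current NumPy release where min() propagates NaN.  np.nanmin() is the
existing separate function; min_with_switch() and its ignore_nan
argument are hypothetical names invented for illustration, not numpy
API, and only show what a single routine with a switch might look like.

```python
import numpy as np

# Real-world style data with one unavoidable bad value.
data = np.array([3.0, np.nan, 1.0, 2.0])

# The ndarray method has no NaN option, so the NaN propagates.
propagated = data.min()

# The NaN-ignoring behavior lives in a separate function, not a method.
ignored = np.nanmin(data)


def min_with_switch(a, ignore_nan=False):
    """Hypothetical single routine: one name, behavior chosen by a switch."""
    return np.nanmin(a) if ignore_nan else np.min(a)


print(propagated)                              # NaN propagated
print(ignored)                                 # NaN ignored
print(min_with_switch(data, ignore_nan=True))  # same result as np.nanmin
```

With the two-function design the caller picks between min() and
nanmin(); with the switch, the caller's code stays the same and only
an argument changes.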

--jh--

