[SciPy-dev] Statistics toolbox and nans
oliphant.travis at ieee.org
Thu Oct 31 23:57:20 CST 2002
What should we do about nan's and the stats toolbox? Stats is one
package where people may use nans to represent missing values.
There are two options that I see.
1) MATLAB option
MATLAB defines 6 new functions (nanmean, nanmedian, nansum, nanmin,
nanmax, and nanstd) that ignore nans properly. These can be used in
place of the normal functions, which don't handle nans properly. Perhaps
they did this as an afterthought.
Note, this is an easy option and is (as of now) implemented in the CVS
Other stats functions may or may not handle nan's properly.
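
To make Option 1 concrete, here is a rough pure-Python sketch of what such
nan-ignoring variants might look like. The function names follow the MATLAB
ones listed above, but the bodies and the _drop_nans helper are only
illustrative assumptions, not the code that is in CVS.

```python
import math


def _drop_nans(values):
    """Hypothetical helper: return the non-NaN entries as floats."""
    return [float(v) for v in values if not math.isnan(v)]


def nansum(values):
    """Sum of the entries that are not NaN (0.0 if none remain)."""
    return math.fsum(_drop_nans(values))


def nanmean(values):
    """Mean of the entries that are not NaN; NaN if none remain."""
    kept = _drop_nans(values)
    if not kept:
        return math.nan
    return math.fsum(kept) / len(kept)


def nanmax(values):
    """Largest entry that is not NaN; NaN if none remain."""
    kept = _drop_nans(values)
    return max(kept) if kept else math.nan


data = [1.0, float("nan"), 3.0]
print(sum(data) / len(data))  # the ordinary mean propagates the NaN
print(nanmean(data))          # the NaN entry is treated as missing
```

The ordinary functions stay untouched, so only callers who ask for the
nan-aware versions pay for the filtering.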
2) Integrated option
All stats functions handle nan's properly.
The drawback to Option 2, which is less difficult to explain, is that
every function is saddled with isnan checking, which may slow things down.
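
As a sketch of what the integrated approach implies, the hypothetical
function below builds the isnan test into an ordinary mean. The name
integrated_mean and its behavior on all-nan input are assumptions made for
illustration only.

```python
import math


def integrated_mean(values):
    """Mean that always treats NaN entries as missing values."""
    total = 0.0
    count = 0
    for v in values:
        # This check runs for every element on every call,
        # even when the data contain no NaNs at all.
        if math.isnan(v):
            continue
        total += v
        count += 1
    # Assumed convention: return NaN when nothing is left to average.
    return total / count if count else math.nan


print(integrated_mean([1.0, float("nan"), 5.0]))
print(integrated_mean([1.0, 2.0, 3.0]))
```

The per-element test is the cost in question: it is paid whether or not
the input actually has missing values.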
Using Knuth's policy of not optimizing prematurely, I tend toward
Are there any other options anybody sees?