[SciPy-dev] scipy.stats: test for handling nan values
Thu Jul 26 11:13:43 CDT 2007
Yes, I would agree that treating missing as nan would achieve the
desired results if the stats functions are set to ignore nan. But it
is not technically correct to treat missing as nan, because nan can
also arise from valid operations on non-missing data (such as dividing
zero by zero). Treating missing as nan therefore makes the bad
assumption that the person using those functions 'knows' this
difference.
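To make the distinction concrete, here is a minimal sketch. The arrays are made up for illustration, and it assumes a NumPy version that provides np.nanmean (it stands in for the nanmean under discussion). Every observation is present, yet a nan still appears and is then dropped as if it had been missing:

```python
import numpy as np

# Every observation here is present; nothing is "missing".
x = np.array([1.0, 0.0, 4.0])
y = np.array([2.0, 0.0, 8.0])

# 0.0 / 0.0 is a defined floating-point operation whose result is nan.
with np.errstate(divide="ignore", invalid="ignore"):
    ratio = x / y

print(ratio)              # the middle element is nan
print(np.nanmean(ratio))  # 0.5: the nan is skipped, exactly as a missing value would be
```

A function that ignores nan cannot tell this computed nan apart from a value that was never recorded.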
One solution is to use something like masked arrays, because a user
can then set their own coding for missing values.
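As a rough sketch of what I mean, assuming numpy.ma and a hypothetical missing code of -999 chosen by the user (the data are made up):

```python
import numpy as np
import numpy.ma as ma

# Hypothetical data set in which the user codes missing readings as -999.
MISSING = -999.0
raw = np.array([2.0, MISSING, 4.0, np.nan, 6.0])

# Mask only the user's missing code; the nan stays visible as a computed value.
data = ma.array(raw, mask=(raw == MISSING))
print(data.count())  # 4 unmasked entries, the nan among them
print(data.mean())   # nan: the genuine nan still propagates

# Masking the nan as well is a separate, explicit decision by the user.
clean = ma.array(raw, mask=(raw == MISSING) | np.isnan(raw))
print(clean.mean())  # 4.0, the mean of 2.0, 4.0 and 6.0
```

The mask keeps 'missing' separate from nan, so the user decides whether a nan should be excluded rather than the stats function deciding for them.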
I think there is a closely related thread that was discussed some time
ago on the Numpy list with the title 'Re: ndarray.fill and
ma.array.filled' by Sasha.
On 7/25/07, David Cournapeau <email@example.com> wrote:
> Trying to solve a few tickets related to nanmean and co, I wanted to
> add tests for those functions, as well as general behaviour of basic
> statistics functions with nan. Part of the test suite (in test_stats.py)
> is based on Wilkinson's Statistics Quiz; missing values are not
> supported. If I finish the test suite by implementing MISSING by nan
> values, is this conceptually correct or not? I wanted to be sure before
> committing the change in the test suite (actually, only adding
> originally disabled tests)