[Numpy-discussion] min() of array containing NaN
Joe Harrington
jh@physics.ucf....
Tue Aug 12 16:22:08 CDT 2008
> It really isn't very hard to replace
> np.sum(A)
> with
> np.sum(A[~isnan(A)])
> if you want to ignore NaNs instead of propagating them. So I don't
> feel a need for special code in sum() that treats NaN as 0.
That's all well and good, until you want to set the axis= keyword.
Then you're stuck with looping. Computing statistics for each pixel column
in a stack of astronomical images with bad pixels and cosmic-ray hits
is one of the most common operations in astronomical data analysis, so this
is an issue for a significant number of current and future users.
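For concreteness, the looping in question might look something like the sketch below. The helper name nan_sum_columns and the small test array are illustrative only and come from no library.

```python
import numpy as np

def nan_sum_columns(stack):
    """Sum each column of a 2-D array, skipping NaNs, one column at a time.

    Hypothetical helper, written only to illustrate the explicit loop.
    """
    out = np.empty(stack.shape[1])
    for j in range(stack.shape[1]):
        col = stack[:, j]
        # Boolean indexing is safe here because col is already 1-D.
        out[j] = col[~np.isnan(col)].sum()
    return out

a = np.arange(9, dtype=float).reshape(3, 3)
a[1, 1] = np.nan
col_sums = nan_sum_columns(a)  # column sums are 9, 8 and 15
print(col_sums)
```

It works, but the loop runs in Python, once per column, which is what hurts on a stack of full-size images.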
>>> a=np.arange(9, dtype=float)
>>> a.shape=(3,3)
>>> a[1,1]=np.nan
>>> a
array([[ 0. , 1. , 2. ],
       [ 3. , nan, 5. ],
       [ 6. , 7. , 8. ]])
>>> np.sum(a)
nan
>>> np.sum(a[~np.isnan(a)])
32.0
Good, but...
>>> np.sum(a[~np.isnan(a)], axis=1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.5/site-packages/numpy/core/fromnumeric.py", line 634, in sum
    return sum(axis, dtype, out)
ValueError: axis(=1) out of bounds
Uh-oh...
>>> np.sum(a[~np.isnan(a)], axis=0)
32.0
Worse: wrong answer but not an exception, since
>>> a[~np.isnan(a)]
array([ 0., 1., 2., 3., 5., 6., 7., 8.])
has the undesired side effect of irreversibly flattening the array.
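One way to keep the shape, and so keep axis= meaningful, is to neutralize the NaNs in place rather than index them away. This is only a sketch. It assumes a NumPy that provides np.nansum and np.ma.masked_invalid, and the zero-fill trick is valid for sums only, not for min, max, or mean.

```python
import numpy as np

a = np.arange(9, dtype=float).reshape(3, 3)
a[1, 1] = np.nan

# Replace NaN with 0 so the array stays 3x3 and axis= still applies.
row_sums = np.where(np.isnan(a), 0.0, a).sum(axis=1)

# np.nansum treats NaN as zero and accepts the axis keyword directly.
row_sums_nansum = np.nansum(a, axis=1)

# A masked array skips the masked (invalid) entries in reductions,
# so it also covers min/max, where filling with zero would be wrong.
masked = np.ma.masked_invalid(a)
row_sums_masked = masked.sum(axis=1)
col_min_masked = masked.min(axis=0)
```

All three give row sums of 3, 8 and 21, and the masked column minimum is 0, 1, 2. None of this makes plain sum() or min() ignore NaNs.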
--jh--