[Numpy-discussion] min() of array containing NaN

Joe Harrington jh@physics.ucf....
Tue Aug 12 16:22:08 CDT 2008


> It really isn't very hard to replace
> np.sum(A)
> with
> np.sum(A[~isnan(A)])
> if you want to ignore NaNs instead of propagating them. So I don't
> feel a need for special code in sum() that treats NaN as 0.

That's all well and good until you want to set the axis= keyword.
Then you're stuck with looping.  Doing stats for each pixel column
in a stack of astronomical images with bad pixels and cosmic-ray hits
is one of the most common operations in astronomical data analysis, so
this is an issue for a significant number of current and future users.
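
To make "stuck with looping" concrete, here is a minimal sketch of the
per-column workaround.  The helper name sum_ignoring_nan_by_loop and the
tiny stack array are hypothetical, made up for illustration; a real image
stack would be far larger, which is where a Python-level loop starts to
hurt.

```python
import numpy as np

def sum_ignoring_nan_by_loop(stack):
    """Sum a 2-D array along axis 0, skipping NaNs, one column at a time.

    Hypothetical helper, for illustration only.
    """
    result = np.empty(stack.shape[1], dtype=float)
    for j in range(stack.shape[1]):
        column = stack[:, j]
        # Boolean-mask indexing is safe here: each column is already 1-D.
        result[j] = np.sum(column[~np.isnan(column)])
    return result

# Three "images" of three pixels each, with one bad pixel marked as NaN.
stack = np.arange(9, dtype=float).reshape(3, 3)
stack[1, 1] = np.nan
column_sums = sum_ignoring_nan_by_loop(stack)
```

The masking trick works inside the loop only because each slice handed to
it is one-dimensional, so there is no axis left to lose.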

>>> a=np.arange(9, dtype=float)
>>> a.shape=(3,3)
>>> a[1,1]=np.nan
>>> a
array([[ 0.        ,  1.        ,  2.        ],
       [ 3.        ,         nan,  5.        ],
       [ 6.        ,  7.        ,  8.        ]])
>>> np.sum(a)
nan
>>> np.sum(a[~np.isnan(a)])
32.0

Good, but...

>>> np.sum(a[~np.isnan(a)], axis=1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.5/site-packages/numpy/core/fromnumeric.py", line 634, in sum
    return sum(axis, dtype, out)
ValueError: axis(=1) out of bounds

Uh-oh...

>>> np.sum(a[~np.isnan(a)], axis=0)
32.0

Worse: a wrong answer (the grand total instead of three column sums) and no exception, since

>>> a[~np.isnan(a)]
array([ 0.,  1.,  2.,  3.,  5.,  6.,  7.,  8.])

has the undesired side effect of irreversibly flattening the array: the
result is 1-D, so the original axes are gone.
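
For comparison, here is a sketch of ways to ignore the NaN while keeping
the shape, so that axis= still means what you expect.  It is offered as an
illustration under my own assumptions, not as something proposed in this
thread: np.where substitutes 0 for NaN instead of dropping elements,
np.nansum takes axis= directly, and np.ma.masked_invalid hides the bad
values behind a mask.

```python
import numpy as np

a = np.arange(9, dtype=float).reshape(3, 3)
a[1, 1] = np.nan

# Substitute 0 for NaN rather than indexing it away; the shape stays (3, 3).
filled = np.where(np.isnan(a), 0.0, a)
row_sums = filled.sum(axis=1)
col_sums = filled.sum(axis=0)

# np.nansum treats NaN as zero and accepts axis= directly.
row_sums_nansum = np.nansum(a, axis=1)

# A masked array hides the NaN and also keeps the shape.
masked = np.ma.masked_invalid(a)
row_sums_masked = masked.sum(axis=1)
```

Zero-filling only suits sums.  For a mean you would also have to divide by
the per-axis count of non-NaN values, which the masked-array route handles
for you.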

--jh--

