[Numpy-discussion] Medians that ignore values

Pierre GM pgmdevlist@gmail....
Fri Sep 19 10:46:37 CDT 2008


On Friday 19 September 2008 11:36:17 Alan G Isaac wrote:
> On 9/19/2008 11:09 AM Stefan Van der Walt apparently wrote:
> > Masked arrays.  Using NaN's for missing values is dangerous.  You may
> > do some operation, which generates invalid results, and then you have
> > a mixed bag of missing and invalid values.
>
> That rather evades my full question, I think?
>
> In the case I mentioned,
> I am filling an array inside a loop,
> and the possible fill values are not constrained.
> So I cannot mask based on value,
> and I cannot mask based on position
> (at least until after the computations are complete).

No, but you may do the opposite: start with a completely masked array and
unmask elements as you fill them.
Say you have a 4x5 array and want to unmask (0,0), (1,2) and (3,4):
>>> import numpy.ma as ma
>>> a = ma.empty((4,5), dtype=float)
>>> a.mask = True
>>> a[0,0] = 0
>>> a[1,2] = 1
>>> a[3,4] = 3
>>> a
masked_array(data =
 [[0.0 -- -- -- --]
 [-- -- 1.0 -- --]
 [-- -- -- -- --]
 [-- -- -- -- 3.0]],
      mask =
 [[False  True  True  True  True]
 [ True  True False  True  True]
 [ True  True  True  True  True]
 [ True  True  True  True False]],
      fill_value=1e+20)
>>> a.max(axis=0)
masked_array(data = [0.0 -- 1.0 -- 3.0],
      mask = [False  True False  True False],
      fill_value=1e+20)
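
Since the thread is about medians, here is a minimal sketch of the same
pattern applied to filling an array inside a loop and then taking a median
over the unmasked entries only. The `results` list is hypothetical data
standing in for whatever the loop computes, and `ma.masked_all` is used as a
shortcut for creating an empty array and setting its mask to True:

```python
import numpy.ma as ma

# Start from a fully masked array: every cell counts as "missing".
a = ma.masked_all((4, 5), dtype=float)

# Hypothetical (row, column, value) results produced inside a loop.
results = [(0, 0, 0.0), (1, 2, 1.0), (3, 4, 3.0), (2, 2, 5.0)]
for i, j, value in results:
    a[i, j] = value  # assigning a value unmasks that cell

# Medians computed over the unmasked cells only.
col_median = ma.median(a, axis=0)   # columns with no data stay masked
overall_median = ma.median(a)       # median of the four filled cells
```

Nothing here depends on the fill values themselves, so the values written in
the loop can be anything, including NaN produced by a genuine error.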


> It seems to me that there are pragmatic reasons
> why people work with NaNs for missing values,
> that perhaps shd not be dismissed so quickly.
> But maybe I am overlooking a simple solution.

The nan* functions (nanmax, nanmin, nansum and so on) tend to be considerably
faster, which might be one reason. A lack of visibility of numpy.ma could be
a second. In any case, I can only agree with the other posters: a NaN in an
array usually means that something went astray.

> PS I confess I do not understand NaNs.
> E.g., why could there not be a value np.miss
> that would be a NaN that represents a missing value?

You can't compare NaNs to anything: a NaN is not equal to any value, not even
to another NaN. How would you know that a given NaN is this np.miss, that is,
a missing value, when np.sqrt(-1.) also gives NaN?
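
A small sketch of that point, under the assumption that the proposed np.miss
would be an ordinary NaN. The name `miss` below is a hypothetical stand-in,
since np.miss does not exist in numpy:

```python
import numpy as np

miss = np.nan  # hypothetical stand-in for the proposed np.miss

# A NaN produced by an invalid operation (warning silenced for the demo).
with np.errstate(invalid="ignore"):
    invalid = np.sqrt(-1.0)

data = np.array([1.0, miss, invalid, 4.0])

# Equality cannot find the "missing" entry: NaN != NaN.
equal_to_miss = data == miss

# isnan finds both NaNs but cannot say which one was "missing"
# and which one came from the invalid sqrt.
nan_positions = np.isnan(data)
```

Here `equal_to_miss` is False everywhere, while `nan_positions` flags both the
second and the third element without distinguishing them.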





