[Numpy-discussion] bug with with fill_values in masked arrays?
Pierre GM
pgmdevlist@gmail....
Fri Mar 21 19:24:40 CDT 2008
On Friday 21 March 2008 12:55:11 Chris Withers wrote:
> Pierre GM wrote:
> > On Wednesday 19 March 2008 19:47:37 Matt Knox wrote:
> >>> 1. why am I not getting my NaN's back?
> >
> > Because they're gone when you create your masked array.
>
> Really? At least one other post has disagreed with that.
Well, yeah, my bad, that depends on whether you use masked_invalid or
fix_invalid or just build a basic masked array.
Example:
>>> import numpy as np
>>> import numpy.ma as ma
>>> x = np.array([1, np.nan, 3])
>>> # Basic construction: nothing is masked, the NaN stays in the data
>>> y = ma.array(x)
>>> y
masked_array(data = [ 1. NaN 3.],
mask = False,
fill_value=1e+20)
>>> # masked_invalid: the NaN is masked but kept in the underlying data
>>> y = ma.masked_invalid(x)
>>> y
masked_array(data = [1.0 -- 3.0],
mask = [False True False],
fill_value=1e+20)
>>> y._data
array([ 1., NaN, 3.])
>>> # fix_invalid: the NaN is masked and replaced by fill_value in the data
>>> y = ma.fix_invalid(x)
>>> y
masked_array(data = [1.0 -- 3.0],
mask = [False True False],
fill_value=1e+20)
>>> y._data
array([ 1.00000000e+00, 1.00000000e+20, 3.00000000e+00])
> And it does seem odd that a value, even if it's a nan, would be
> destroyed...
Having NaNs in an array usually reduces performance: the option we follow w/
fix_invalid is to clear the NaNs out of the masked array and keep track of
where they were by setting the mask to True at the appropriate locations. That
way, you avoid the drop in performance that comes with having NaNs in your
underlying array.
Oh, and NaNs will be transformed to 0 if you use ints...
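To check which constructor keeps the NaN in the underlying data, something like the sketch below should do. It assumes a recent numpy where the underlying array is exposed as .data (the _data attribute used above is the private equivalent); the variable names are only for illustration.

```python
import numpy as np
import numpy.ma as ma

x = np.array([1.0, np.nan, 3.0])

plain = ma.array(x)            # basic construction: nothing is masked
masked = ma.masked_invalid(x)  # NaN is masked, underlying data untouched
fixed = ma.fix_invalid(x)      # NaN is masked and overwritten by fill_value

# Full boolean mask of the basic array (all False here)
plain_mask = ma.getmaskarray(plain)

# Does the underlying data still hold a NaN at position 1?
masked_keeps_nan = bool(np.isnan(masked.data[1]))
fixed_keeps_nan = bool(np.isnan(fixed.data[1]))

print(plain_mask, masked_keeps_nan, fixed_keeps_nan)
```

So if you want your NaNs back later, masked_invalid is the one to use; fix_invalid trades them for the fill value.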
> > The idea here is to
> > get rid of the nan in your data
>
> No, it's to mask them, otherwise I would have used a normal array, not a
> ma.
Nope, the idea really is to make things as efficient as possible. Now, you
can still have your nans if you're ready to eat them.
> > to avoid potential problems while keeping
> > track of where the nans were in the first place.
>
> ...like plotting them on a graph, which the current behaviour makes
> unworkable, that you end up doing a myarray.filled(0) to get around it,
> with imperfect results.
Send an example. I don't seem to have this problem:
import matplotlib.pyplot as plt
x = np.arange(10, dtype=float)
x[5] = np.nan
y = ma.masked_invalid(x)
plt.plot(x, 'ok-')
plt.plot(y, 'sr-')
> Right, but why, when the masked array is cast back to a list of numbers,
> is the fill_value of the ma not respected?
Because in your particular case, you're inspecting elements one by one, and
then each masked element is returned as the masked singleton, which is a
special value. That has nothing to do w/ the filling.
> >>> 2. why is the wrong fill value being used here?
> >>
> >> the second element in the array iteration here is actually the
> >> numpy.ma.masked constant, which always has the same fill value...
>
> ...and that's a bug.
And once again, it's not. numpy.ma.masked is a special value, like numpy.nan
or numpy.inf.
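A rough sketch of the difference, assuming a recent numpy: element access hands back the masked singleton, while filled() is the call that actually uses the array's own fill_value. The -999.0 value and the variable names are arbitrary, picked for illustration.

```python
import numpy as np
import numpy.ma as ma

x = np.array([1.0, np.nan, 3.0])
y = ma.masked_invalid(x)
y.fill_value = -999.0  # arbitrary fill value for this example

# Accessing a masked element returns the numpy.ma.masked singleton,
# which is independent of y.fill_value.
second = y[1]
second_is_masked = second is ma.masked

# Iterating goes through the same element access, so masked entries
# are expected to show up as the singleton here as well.
seen = [v is ma.masked for v in y]

# To get plain numbers with the array's own fill_value, fill first.
as_list = y.filled().tolist()

print(second_is_masked, seen, as_list)
```

In other words, call filled() before converting to a list if you want the fill_value to be respected.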