[Numpy-discussion] masked arrays as array indices (is a bad idea)
Pierre GM
pgmdevlist@gmail....
Mon Sep 21 13:43:26 CDT 2009
On Sep 21, 2009, at 12:17 PM, Ryan May wrote:
> 2009/9/21 Ernest Adrogué <eadrogue@gmx.net>
> Hello there,
>
> Given a masked array such as this one:
>
> In [19]: x = np.ma.masked_equal([-1, -1, 0, -1, 2], -1)
>
> In [20]: x
> Out[20]:
> masked_array(data = [-- -- 0 -- 2],
> mask = [ True True False True False],
> fill_value = 999999)
>
> When you make an assignment in the vein of x[x == 0] = 25
> the result can be a bit puzzling:
>
> In [21]: x[x == 0] = 25
>
> In [22]: x
> Out[22]:
> masked_array(data = [25 25 25 25 2],
> mask = [False False False False False],
> fill_value = 999999)
>
> Is this the correct result or have I found a bug?
>
> I see the same here on 1.4.0.dev7400. Seems pretty odd to me. Then
> again, it's a bit more complex using masked boolean arrays for
> indexing since you have True, False, and masked values. Anyone have
> thoughts on what *should* happen here? Or is this it?
Using a masked array in fancy indexing is always a bad idea, as
there's no way of guessing the behavior one would want for missing
values: should they be evaluated as False? As True? You should really
use the `filled` method on the index to control the behavior explicitly.
>>> x[(x==0).filled(False)]
masked_array(data = [0],
mask = [False],
fill_value = 999999)
>>> x[(x==0).filled(True)]
masked_array(data = [-- -- 0 --],
mask = [ True True False True],
fill_value = 999999)
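For the assignment from the original question, the same idea might look like the sketch below. It assumes the masked entries should be left alone (that is, treated as False); `idx` is only an illustrative name.

```python
import numpy as np

x = np.ma.masked_equal([-1, -1, 0, -1, 2], -1)

# Resolve the masked entries of the comparison explicitly, so that the
# index is a plain boolean ndarray with no mask left to interpret.
idx = (x == 0).filled(False)

# Only the unmasked element equal to 0 is overwritten; the masked
# positions keep their mask.
x[idx] = 25

print(idx)
print(x)
```

Using `filled(True)` instead would also overwrite (and unmask) the masked positions, so the choice of fill value is where the intended behavior gets stated.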
P.
[If you're really interested:
When testing for equality, a masked array is first filled with 0 (that
was the behavior of the first implementation of numpy.ma), tested for
equality, and the mask of the result set to the mask of the input.
When used in fancy indexing, a masked array is viewed as a standard
ndarray by dropping the mask. In the current case, the combination is
therefore equivalent to (x.filled(0)==0), which explains why the
missing values are treated as True... I agree that the prefilling may
not be necessary...]
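The aside above can be illustrated roughly as follows. This sketch only reproduces the described steps by hand (fill with 0, then compare); it does not inspect the numpy.ma internals, whose behavior may differ between versions. The names `raw` and `selected` are illustrative.

```python
import numpy as np

x = np.ma.masked_equal([-1, -1, 0, -1, 2], -1)

# Fill the masked entries with 0, then compare. The result is a plain
# ndarray in which every formerly masked position compares equal to 0.
raw = x.filled(0) == 0

# Indexing with that plain boolean array selects the masked positions
# as well as the genuine zero.
selected = x[raw]

print(raw)
print(selected)
```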