[NumPy-Tickets] [NumPy] #1974: Subscript assignment on `hard_mask` masked array produces wrong result whenever the subscripted mask returns all False (i.e. `nomask`)
NumPy Trac
numpy-tickets@scipy....
Tue Nov 8 12:17:18 CST 2011
#1974: Subscript assignment on `hard_mask` masked array produces wrong result
whenever the subscripted mask returns all False (i.e. `nomask`)
----------------------------------+-----------------------------------------
Reporter: codewarrior | Owner: pierregm
Type: defect | Status: new
Priority: normal | Milestone: Unscheduled
Component: numpy.ma | Version: 1.6.0
Keywords: MaskedArray, setitem |
----------------------------------+-----------------------------------------
Subscript assignment on a `hard_mask` masked array, using a slice or other
index that selects a fully unmasked section of the array, assigns only to
the second item of that section instead of to the entire section.
Upon investigation, the cause is that mask_or reduces the fully unmasked
section of the mask to a single 'nomask' value, which is then negated and
used to subscript-assign the section of the data array. This is incorrect
because negating 'nomask' produces a single 'True' value, which is treated
as the integer index '1' and so selects the second item of the section.
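The collapse described above can be observed directly. The following is a minimal sketch that uses only `np.ma.mask_or` and `np.ma.nomask`; the variable names are illustrative, and it shows the values involved rather than reproducing the 1.6 indexing behaviour itself.

```python
import numpy as np

# A fully unmasked section of a mask, as produced by _mask[indx] in the ticket.
section_mask = np.zeros((5, 6), dtype=bool)

# mask_or shrinks an all-False result down to the nomask singleton by default.
mindx = np.ma.mask_or(section_mask, np.ma.nomask, copy=True)

collapsed = mindx is np.ma.nomask   # the array shape is gone
mask_size = mindx.size              # a scalar, so size is 1
negated_as_int = int(~mindx)        # ~nomask is True, i.e. 1 as an integer

print(collapsed, mask_size, negated_as_int)  # True 1 1
```

The size of the mask (1) no longer matches the size of the data section (30), which is why a size check on the data alone cannot detect the collapse.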
This appears to be an oversight on line 3038 of numpy/ma/core.py, in the
final else: clause of MaskedArray.__setitem__().
The original code reads:
{{{
mindx = mask_or(_mask[indx], mval, copy=True)
dindx = self._data[indx]
if dindx.size > 1:
    dindx[~mindx] = dval
elif mindx is nomask:
    dindx = dval
}}}
It seems like it should be checking 'mindx.size' and not 'dindx.size',
because mask_or is free to shrink the return value down to 'nomask', which
has size 1 even when the data section is larger. When I use this corrected
code, I no longer observe the problem:
{{{
mindx = mask_or(_mask[indx], mval, copy=True)
dindx = self._data[indx]
if mindx.size > 1:
    dindx[~mindx] = dval
elif mindx is nomask:
    dindx = dval
}}}
This was apparently fixed in
https://github.com/numpy/numpy/commit/a6e869b70b09df9381d341ed0d2b18f88d8fe3d6
but that fix can't be backported to 1.6 because it uses np.copyto().
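For comparison, a masked assignment can also be written without np.copyto() by keeping the mask at its full shape and selecting with np.where(). This is only a sketch of that idea, not the fix in the linked commit; `assign_unmasked` is a hypothetical helper name and it works on plain arrays rather than on MaskedArray internals.

```python
import numpy as np

def assign_unmasked(data, mask, indx, value):
    """Hypothetical helper: write `value` into data[indx] where mask[indx] is False."""
    # Keep the section mask at full shape; it is never shrunk to nomask here.
    sub_mask = np.asarray(mask[indx], dtype=bool)
    # Masked positions keep their old data, unmasked positions take the new value.
    data[indx] = np.where(sub_mask, data[indx], value)

data = np.arange(30).reshape(5, 6)
mask = np.zeros(data.shape, dtype=bool)
mask[0, 0] = True                      # one hard-masked element

assign_unmasked(data, mask, slice(None), 333)
print(data[0, 0], data[0, 1], data[4, 5])  # 0 333 333
```

Because the mask keeps the shape of the section, a fully unmasked section is handled the same way as a partially masked one.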
Here is code that demonstrates the error.
{{{
>>> from numpy import *
>>> a = arange(30)
>>> a.shape=5,6
>>> b = zeros_like(a)
>>> m = ma.masked_array(a,b,hard_mask=True) #only happens when hard_mask is True
>>> m
masked_array(data =
[[0 1 2 3 4 5]
[6 7 8 9 10 11]
[12 13 14 15 16 17]
[18 19 20 21 22 23]
[24 25 26 27 28 29]],
mask =
[[False False False False False False]
[False False False False False False]
[False False False False False False]
[False False False False False False]
[False False False False False False]],
fill_value = 999999)
>>> m[:] = 333
>>> m #entire array should be 333 now
masked_array(data =
[[0 1 2 3 4 5]
[333 333 333 333 333 333] #uh-oh, only the second row (index 1) is set
[12 13 14 15 16 17]
[18 19 20 21 22 23]
[24 25 26 27 28 29]],
mask =
[[False False False False False False]
[False False False False False False]
[False False False False False False]
[False False False False False False]
[False False False False False False]],
fill_value = 999999)
}}}
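The demonstration above can be turned into a small regression check. This is a sketch that assumes a NumPy release containing the fix referenced earlier (later than 1.6); on an affected version the first assertion would be expected to fail. The second half checks that a hard mask still protects masked elements.

```python
import numpy as np

a = np.arange(30).reshape(5, 6)

# Case 1: fully unmasked hard-mask array, the situation from the ticket.
m = np.ma.masked_array(a.copy(), mask=np.zeros(a.shape, dtype=bool), hard_mask=True)
m[:] = 333
all_set = bool((m.data == 333).all())
assert all_set, "slice assignment should reach every unmasked element"

# Case 2: one masked element; a hard mask must leave it untouched.
mask = np.zeros(a.shape, dtype=bool)
mask[0, 0] = True
h = np.ma.masked_array(a.copy(), mask=mask, hard_mask=True)
h[:] = 333
protected_value = int(h.data[0, 0])    # original data survives under the mask
still_masked = bool(h.mask[0, 0])
others_set = bool((h.data[~mask] == 333).all())

print(all_set, protected_value, still_masked, others_set)
```

Running both cases together helps confirm that a fix for the unmasked case does not weaken the hard-mask guarantee.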
--
Ticket URL: <http://projects.scipy.org/numpy/ticket/1974>
NumPy <http://projects.scipy.org/numpy>