[Numpy-discussion] Trouble With MaskedArray and Shared Masks
Pierre GM
pgmdevlist@gmail....
Tue Feb 26 13:32:05 CST 2008
Alexander,
The rationale behind the current behavior is to avoid an accidental
propagation of the mask. Consider the following example:
>>> import numpy
>>> from numpy.ma import masked_array
>>> m = numpy.array([1,0,0,1,0], dtype=numpy.bool_)
>>> x = numpy.array([1,2,3,4,5])
>>> y = numpy.sqrt([5,4,3,2,1])
>>> mx = masked_array(x, mask=m)
>>> my = masked_array(y, mask=m)
>>> mx[0] = 0
>>> print mx, my, m
[0 2 3 -- 5] [-- 4 3 -- 1] [ True False False True False]
At creation, mx._sharedmask and my._sharedmask are both True. Setting
mx[0]=0 forces mx._mask to be copied first, so that the mask of my is not
affected.
Now,
>>> m = numpy.array([1,0,0,1,0], dtype=numpy.bool_)
>>> x = numpy.array([1,2,3,4,5])
>>> y = numpy.sqrt([5,4,3,2,1])
>>> mx = masked_array(x, mask=m)
>>> my = masked_array(y, mask=m)
>>> mx._sharedmask = False
>>> mx[0] = 0
>>> print mx, my, m
[0 2 3 -- 5] [5 4 3 -- 1] [False False False True False]
By setting mx._sharedmask=False, we tricked numpy.ma into thinking that it's
OK to update the mask of mx (that is, m) in place, so my gets updated as
well. Sometimes that's what you want (your case, for example), but often it
is not: I was bitten more than once before reintroducing the _sharedmask
flag.
As you've observed, setting a private flag isn't a very good idea: you should
use the .unshare_mask() method instead, which copies the mask and sets
_sharedmask to False. OK, in your example, copying the mask is not needed,
but in more general cases, it is.
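For illustration, here is a minimal sketch of the unshare_mask() route. It
assumes a NumPy release where numpy.ma provides MaskedArray.unshare_mask()
and the read-only sharedmask property. How plain item assignment treats a
shared mask may differ between releases, so the sketch relies only on the
explicit copy:

```python
import numpy as np
from numpy.ma import masked_array

m = np.array([1, 0, 0, 1, 0], dtype=bool)
x = np.array([1, 2, 3, 4, 5])
y = np.sqrt([5, 4, 3, 2, 1])

mx = masked_array(x, mask=m)
my = masked_array(y, mask=m)

# Give mx a private copy of the mask; this also clears its shared-mask flag.
mx.unshare_mask()

# Unmasking mx[0] now touches only mx's own copy of the mask.
mx[0] = 0

print(mx.sharedmask)      # flag for mx after unshare_mask()
print(mx.mask[0])         # mask entry of mx at index 0
print(my.mask[0], m[0])   # mask entries of my and m at index 0
```

After the call, neither m nor the mask of my should change when mx is
modified, which is the safe counterpart of flipping the private flag by hand.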
At initialization, self._sharedmask is set to (not copy). That is, unless you
specify copy=True at creation (the default being copy=False),
self._sharedmask is True. Now, I recognize it's not obvious, and perhaps we
could introduce yet another parameter to masked_array/array/MaskedArray,
share_mask, that would default to True and set
self._sharedmask=(not copy)&share_mask
So: should we introduce this extra parameter?
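To make the rule concrete, here is a small sketch. The first part reads the
flag through the public sharedmask property (assuming a NumPy release that
exposes it); the second tabulates the proposed expression. The share_mask
parameter and the helper proposed_sharedmask are hypothetical: they only
mirror the formula above and are not part of numpy.ma.

```python
import numpy as np
from numpy.ma import masked_array

m = np.array([1, 0, 0, 1, 0], dtype=bool)
x = np.array([1, 2, 3, 4, 5])

# Current rule: the flag is simply (not copy).
shared_default = masked_array(x, mask=m).sharedmask
shared_copied = masked_array(x, mask=m, copy=True).sharedmask
print(shared_default, shared_copied)


def proposed_sharedmask(copy=False, share_mask=True):
    """Hypothetical helper: value of the flag under the proposed rule."""
    return (not copy) & share_mask


# Truth table for the proposed (not copy) & share_mask.
for copy in (False, True):
    for share_mask in (True, False):
        print(copy, share_mask, proposed_sharedmask(copy, share_mask))
```

Under this rule the flag is True only when the mask was not copied and the
caller did not opt out, so share_mask=False would be the way to ask for
in-place mask updates without touching the private attribute.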
In any case, I hope it helps.
P.