[Numpy-discussion] A reimplementation of MaskedArray
pgmdevlist at gmail.com
Tue Nov 21 22:09:17 CST 2006
On Tuesday 21 November 2006 21:11, Michael Sorich wrote:
> I think that the new implementation is making a copy of the data with
> indexing a MA. This is different from both ndarray and the existing
> numpy ma version.
If you check the definition of MaskedArray.__new__, you'll see that the `copy` argument is set to True by default. Setting it to False seems to give what you expect. Should I make that the default ?
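A minimal sketch of what the `copy` argument controls. It assumes the `numpy.ma` module shipped with current NumPy rather than the `maskedarray` package discussed in this thread, so defaults may differ; the variable names are made up for illustration.

```python
import numpy as np

raw = np.arange(5)

# copy=False wraps the existing buffer, copy=True duplicates it.
shared = np.ma.MaskedArray(raw, copy=False)
copied = np.ma.MaskedArray(raw, copy=True)

# Modify the original array after both masked arrays were built.
raw[0] = 99

shares_with_raw = np.shares_memory(shared.data, raw)
copy_is_independent = not np.shares_memory(copied.data, raw)

print(shared[0], copied[0])
```

With `copy=False` the change to `raw` is visible through `shared`, while `copied` keeps the old value.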
> Having subviews of the mask seems complicated with the mask being
Why ? nomask is just a trick to avoid unnecessary computations on a mask full of False that doesn't need updating.
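To illustrate the `nomask` shortcut, here is a small sketch using `numpy.ma` from current NumPy (an assumption: the `maskedarray` package itself is not used here). `getmask` returns the `nomask` sentinel when no mask was set, and `getmaskarray` expands it to a full boolean array on demand.

```python
import numpy as np

x = np.ma.array([1.0, 2.0, 3.0])  # no mask given

# The mask is the nomask sentinel, not an array of False.
no_mask_set = np.ma.getmask(x) is np.ma.nomask

# A full boolean array is only built when explicitly requested.
full_mask = np.ma.getmaskarray(x)

print(no_mask_set, full_mask)
```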
> What happens if the view sets a new masked value and hence
> changes from nomask to a boolean array ?
> How does the parent mask get updated?
Both implementations work the same way: the parent mask is not updated.
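The scenario in question can be sketched as follows, assuming `numpy.ma` from current NumPy as a stand-in for the two implementations compared here. The sketch only covers the case where the parent starts with `nomask`; a parent that already carries a full boolean mask may behave differently depending on the NumPy version.

```python
import numpy as np

parent = np.ma.array([1, 2, 3, 4])  # mask is nomask
view = parent[:2]

# Masking an element of the view forces the view to build its own mask.
view[0] = np.ma.masked

parent_mask_untouched = np.ma.getmask(parent) is np.ma.nomask
view_mask = np.ma.getmaskarray(view)

# The underlying data buffer is still shared between parent and view.
data_shared = np.shares_memory(view.data, parent.data)

print(parent_mask_untouched, view_mask, data_shared)
```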
> I think the numpy implementation gets away with this by
> returning a view of only the _data part if the ma mask is nomask
By numpy implementation, you mean numpy.core.ma, right ?
If so, then yes:
`self.__getitem__(i)` returns `self._data[i]` if the mask is nomask.
In maskedarray, if the mask is nomask, then
`self.__getitem__(i)` returns `self._data[i]` only if `self._data[i].size==1`, else it returns a masked array.
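As a rough illustration of the second rule (single element returned as plain data, anything larger returned as a masked array), here is a sketch against `numpy.ma` in current NumPy, which is assumed to follow the same convention; it is not the `maskedarray` code itself.

```python
import numpy as np

x = np.ma.array([10, 20, 30, 40])  # mask is nomask

single = x[0]    # one unmasked element
chunk = x[1:3]   # a slice with more than one element

single_is_plain = not isinstance(single, np.ma.MaskedArray)
chunk_is_masked_array = isinstance(chunk, np.ma.MaskedArray)

print(type(single).__name__, type(chunk).__name__)
```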
> I don't like this solution as I would expect a ma to be returned. Also I
> suspect that if the ma is to be a view of another ma, then in __new__
> a mask that is a boolean array of all False cannot be converted to
I'm not following you here: there's no `__new__` in numpy.core.ma (that's one of the reasons why a masked array in numpy.core.ma is basically different from an ndarray...). And in maskedarray, a mask that is an array of `False` is set to `nomask` by default, but you can use the `flag` option: please check the documentation of `maskedarray.masked_array`: flag=True converts the mask, flag=False keeps an array of booleans.
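The `flag` option belongs to the `maskedarray` package and is not reproduced here. As an analogous sketch, `numpy.ma.make_mask` in current NumPy has a `shrink` parameter that makes the same choice between collapsing an all-False mask to `nomask` and keeping the boolean array.

```python
import numpy as np

all_false = np.zeros(4, dtype=bool)

# shrink=True collapses an all-False mask to the nomask sentinel.
shrunk = np.ma.make_mask(all_false, shrink=True)

# shrink=False keeps the full boolean array.
kept = np.ma.make_mask(all_false, shrink=False)

print(shrunk is np.ma.nomask, kept)
```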
One thing to remember is that masks tend to be copied more often than not. And I don't think it's advisable to modify the mask of the parent: it's no longer the same object, as the mask is now different ! In other words, you could share data, but you shouldn't share a mask. And I keep getting bitten by data sharing, that's why I had set the `copy` flag to True by default.
> I like the new implementation of maskedarray, especially the focus on
> simplicity. The only simple solution I see is to have the mask be a
> boolean array at all times....
You haven't convinced me yet of why a mask of False is better than `nomask`.
What don't you like in maskedarray (aka the new implementation) ?