[Numpy-discussion] MaskedArray __setitem__ Performance

Alexander Michael lxander.m@gmail....
Sat Feb 16 15:41:12 CST 2008


On Feb 16, 2008 3:21 PM, Pierre GM <pgmdevlist@gmail.com> wrote:
> > Can I safely carry around the data, mask and MaskedArray? I'm
> > considering working along the lines of the following conceptual
> > outline:
>
> That depends a lot on what calculate_results does, and whether you update the
> arrays in place or not.
>
> > d = numpy.empty(shape, dtype)
> > m = numpy.zeros(shape, bool)
> > a = numpy.ma.MaskedArray(d, m)
>
> You should be able to update d and m, and have the changes passed to a (as
> long as you're not using copy=True). You have to make sure that m has indeed
> a dtype of MaskType (or bool), else you'll break the connection.
>
> Explanation: in MaskedArray.__new__, the mask argument is converted to a dtype
> of MaskType (bool). If the mask originally has an integer dtype, say, a copy
> is made, and the _mask of your masked array does not point to `mask`. For
> example:
> >>> import numpy
> >>> d = numpy.array([1, 2, 3])
> >>> m = numpy.array([0, 0, 1])
> >>> x = numpy.ma.array(d, mask=m)
> >>> print(x)
> [1 2 --]
> >>> d[0] = 17
> >>> print(x)
> [17 2 --]
>
> OK, x is properly updated. If we now try to change the mask:
>
> >>> m[0] = 1
> >>> print(x)
> [17 2 --]
>
> x is not updated: x._mask points not to m but to a copy of m, made when the
> dtype was converted from int to bool.
> Now, if we ensure that m is an array of booleans:
> >>> d = numpy.array([1, 2, 3])
> >>> m = numpy.array([0, 0, 1], dtype=bool)
> >>> x = numpy.ma.array(d, mask=m)
> >>> print(x)
> [1 2 --]
> >>> d[0] = 17
> >>> print(x)
> [17 2 --]
> >>> m[0] = True
> >>> print(x)
> [-- 2 --]
> m was of the correct dtype in the first place, so no copy is made, and x._mask
> does point to m.
>
> In short: in your example, updating d and m should work and be more efficient
> than updating a directly.
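
A quick way to check whether the connection described above holds is to
ask NumPy whether the masked array's mask and m share memory. This is only
a sketch: mask_is_shared is a hypothetical helper name, not part of
numpy.ma, and it assumes a NumPy recent enough to provide
numpy.shares_memory.

```python
import numpy as np


def mask_is_shared(masked, mask):
    """Hypothetical helper: True if `masked` uses `mask`'s memory as its mask."""
    return np.shares_memory(np.ma.getmaskarray(masked), mask)


d = np.array([1, 2, 3])
m_int = np.array([0, 0, 1])                # integer mask: converted to bool, hence copied
m_bool = np.array([0, 0, 1], dtype=bool)   # already bool: no conversion needed

x_int = np.ma.array(d, mask=m_int)
x_bool = np.ma.array(d, mask=m_bool)

shared_int = mask_is_shared(x_int, m_int)     # expected: connection broken
shared_bool = mask_is_shared(x_bool, m_bool)  # expected: connection kept

# Update the underlying arrays in place rather than the masked array itself.
d[0] = 17
m_bool[0] = True
first_is_masked = bool(np.ma.getmaskarray(x_bool)[0])
```

If shared_bool comes out False, the mask was copied somewhere along the
way (wrong dtype, or copy=True), and updates to m will not reach the
masked array.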

Cool. Thanks!

