[Numpy-discussion] MaskedArray __setitem__ Performance
Pierre GM
pgmdevlist@gmail....
Sat Feb 16 14:21:32 CST 2008
> Can I safely carry around the data, mask and MaskedArray? I'm
> considering working along the lines of the following conceptual
> outline:
That depends a lot on what calculate_results does, and whether you update the
arrays in place or not.
> d = numpy.array(shape, dtype)
> m = numpy.array(shape, bool)
> a = numpy.ma.MaskedArray(d, m)
You should be able to update d and m, and have the changes passed to a (as
long as you're not using copy=True). You have to make sure that m has indeed
a dtype of MaskType (or bool), else you'll break the connection.
Explanation: in MaskedArray.__new__, the mask argument is converted to a dtype
of MaskType (bool). If the mask is originally an integer array, for example, a
copy is made, and the _mask of your masked array does not point to `mask`. For
example:
>>> import numpy
>>> d = numpy.array([1, 2, 3])
>>> m = numpy.array([0, 0, 1])
>>> x = numpy.ma.array(d, mask=m)
>>> print(x)
[1 2 --]
>>> d[0] = 17
>>> print(x)
[17 2 --]
OK, x is properly updated. If we now try to change the mask:
>>> m[0] = 1
>>> print(x)
[17 2 --]
x is not updated: x._mask points not to m but to a copy of m, which was made
when the dtype was converted from int to bool.
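Whether the link survived construction can also be checked directly, by asking
NumPy if the arrays share memory. A minimal sketch, assuming a NumPy recent
enough to provide numpy.shares_memory (which postdates this message) and
Python 3 print syntax:

```python
import numpy

d = numpy.array([1, 2, 3])
m = numpy.array([0, 0, 1])      # integer mask, as in the example above
x = numpy.ma.array(d, mask=m)

# x.data and x.mask expose the underlying data and mask arrays.
data_linked = bool(numpy.shares_memory(x.data, d))
mask_linked = bool(numpy.shares_memory(x.mask, m))

print("data linked:", data_linked)   # no copy of d was needed
print("mask linked:", mask_linked)   # m had to be converted to bool
```

Here the data should be reported as linked and the mask as not linked, which
matches the behaviour seen in the session above.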
Now, if we ensure that m is an array of booleans:
>>> d = numpy.array([1, 2, 3])
>>> m = numpy.array([0, 0, 1], dtype=bool)
>>> x = numpy.ma.array(d, mask=m)
>>> print(x)
[1 2 --]
>>> d[0] = 17
>>> print(x)
[17 2 --]
>>> m[0] = 1
>>> print(x)
[-- 2 --]
m was of the correct dtype in the first place, so no copy is made, and x._mask
does point to m.
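The copy=True caveat mentioned earlier can be illustrated the same way. This is
only a sketch, written for Python 3; the names `shared` and `copied` are chosen
here for illustration:

```python
import numpy

d = numpy.array([1, 2, 3])
m = numpy.array([0, 0, 1], dtype=bool)

shared = numpy.ma.array(d, mask=m)             # default: no copy requested
copied = numpy.ma.array(d, mask=m, copy=True)  # private copies of data and mask

m[0] = True   # in-place change to the mask
d[1] = 42     # in-place change to the data

print(shared)   # follows both changes
print(copied)   # still shows the values from construction time
```

With copy=True the masked array owns its own data and mask, so later changes to
d and m are not seen.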
In short: in your example, updating d and m should work and be more efficient
than updating a directly.
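A rough sketch of that workflow, for Python 3. The body of calculate_results is
a made-up stand-in (the real function is not shown in this thread); the point is
only that d and m are updated in place with slice assignment, so that a keeps
pointing at them:

```python
import numpy

d = numpy.zeros(4)
m = numpy.ones(4, dtype=bool)    # bool from the start; everything masked at first
a = numpy.ma.array(d, mask=m)

def calculate_results(step):
    """Hypothetical stand-in: return new values and a mask of invalid entries."""
    values = numpy.arange(4, dtype=float) * step
    invalid = values > 4.0
    return values, invalid

for step in (1, 2):
    values, invalid = calculate_results(step)
    d[:] = values     # writes into the existing buffer that a uses
    m[:] = invalid    # same for the mask; "m = invalid" would only rebind the name

print(a)
```

Rebinding the names (d = values, m = invalid) would create new arrays and leave
a untouched, so the slice assignments are what keeps the three objects in sync.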