A reimplementation of MaskedArray

Michael Sorich michael.sorich at gmail.com
Tue Nov 7 19:11:24 CST 2006


On 10/25/06, Pierre GM <pgmdevlist at gmail.com> wrote:
> On Tuesday 24 October 2006 02:50, Michael Sorich wrote:
> > I am currently running numpy rc2 (I haven't tried your
> > reimplementation yet as I am still using python 2.3). I am wondering
> > whether the new maskedarray is able to handle construction of arrays
> > from masked scalar values (not sure if this is the correct term).
>
> The answer is no, unfortunately: both the new and old implementations fail at
> the same point, raising a TypeError: lists are processed through
> numpy.core.numeric.array, which doesn't handle that.

I have finally gotten around to upgrading to python 2.4 and have had a
chance to play with your new version of the MaskedArray. It is great
to see that someone is working on this. I have a few thoughts on
masked arrays that may or may not warrant discussion.

1. It would be nice if the masked_singleton could be passed into an
ndarray, as this would allow it to be passed into the MaskedArray, e.g.

import numpy as N
import ma.maskedarray as MA
test = N.array([1,2,MA.masked])
>> ValueError: setting an array element with a sequence

If the masked_singleton were implemented as an object that is not a
MaskedArray (which is a sequence that numpy.array chokes on), then a
valid numpy array with an object dtype could be produced, e.g.

class MaskedScalar:
    def __str__(self):
        return 'masked'
masked = MaskedScalar()

test = N.array([1,2,masked])
print test.dtype, test
>>object [1 2 masked]
print test == masked
>>[False False True]
print test[2] == masked
>>True
print test[2] is masked
>>True

Then it would be possible to define a masked array either as
MA.array([1,2,masked]) or as MA.array(N.array([1,2,masked])). In the
__init__ of the MaskedArray, if the incoming ndarray has an object
dtype, the mask could simply be calculated from a==masked.
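
A minimal sketch of that idea, written against present-day Python and
the numpy.ma module rather than the ma.maskedarray package discussed in
this thread. The MaskedScalar sentinel is the one proposed above; the
helper from_object_array and its fill_value and dtype parameters are
hypothetical names used only for illustration:

```python
import numpy as np
import numpy.ma as ma


class MaskedScalar:
    """Hypothetical non-sequence sentinel, as proposed in the message."""

    def __repr__(self):
        return 'masked'


masked = MaskedScalar()


def from_object_array(values, fill_value=0, dtype=int):
    """Build a masked array from a sequence that may contain `masked`.

    fill_value and dtype are assumptions of this sketch: the sentinel
    positions need some placeholder before casting to a numeric dtype.
    """
    obj = np.array(values, dtype=object)
    # Identity test per element; this avoids relying on how == behaves
    # for arbitrary objects stored in the array.
    mask = np.array([v is masked for v in obj.ravel()], dtype=bool)
    mask = mask.reshape(obj.shape)
    data = obj.copy()
    data[mask] = fill_value  # placeholder under the mask
    return ma.array(data.astype(dtype), mask=mask)


result = from_object_array([1, 2, masked])
print(result)
```

The identity test (is) is used here instead of a==masked so that the
result does not depend on elementwise equality of an object array.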

2. What happens if a masked array is passed into an ndarray or passed
into a MaskedArray with another mask?

test_ma1 = MA.array([1,2,3], mask=[False, False, True])
print test_ma1
>>[1 2 --]

print N.array(test_ma1)
>>[1 2 3]

test_ma2 = MA.array(test_ma1, mask=[True, False, False])
print test_ma2
>>[-- 2 3]

I suppose it depends on whether you are masking useful data, or the
masks represent missing data. In the former case it may make sense to
change or remove the mask. However, in the latter case the original
data entered is a bogus value which should never be unmasked. In this
case, when converting to an ndarray, I think it makes more sense to
produce an object ndarray in which each missing value is replaced by
the masked singleton. Additionally, if the MaskedArray is holding
missing data, it does not make much sense to be able to pass both an
existing masked array and a new mask into the MA constructor.
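
A rough sketch of the two behaviours suggested for the missing-data
case, again using present-day numpy.ma rather than the package under
discussion. The names MissingValue, MISSING, to_object_array and remask
are hypothetical and exist only for this illustration:

```python
import numpy as np
import numpy.ma as ma


class MissingValue:
    """Hypothetical stand-in for the masked singleton."""

    def __repr__(self):
        return 'masked'


MISSING = MissingValue()


def to_object_array(marr):
    """Convert to an object ndarray, never exposing masked data."""
    out = ma.getdata(marr).astype(object)  # astype returns a copy
    out[ma.getmaskarray(marr)] = MISSING
    return out


def remask(marr, new_mask):
    """Add a mask without unmasking anything that was already masked."""
    combined = ma.getmaskarray(marr) | np.asarray(new_mask, dtype=bool)
    return ma.array(ma.getdata(marr), mask=combined)


test_ma1 = ma.array([1, 2, 3], mask=[False, False, True])
as_objects = to_object_array(test_ma1)
test_ma2 = remask(test_ma1, [True, False, False])
print(as_objects)
print(test_ma2)
```

With this approach the bogus value 3 stored under the original mask is
never visible, either after conversion or after applying a second mask.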




