[Numpy-discussion] MaskedArray and Record Arrays

Alexander Michael lxander.m@gmail....
Thu Jan 3 14:41:16 CST 2008


I am experimenting with the new MaskedArray (from
<http://svn.scipy.org/svn/numpy/branches/maskedarray>) as a
replacement for my own home-brewed masked data handling mechanisms. In
what I have built myself, I often work with record arrays that have a
single mask for the whole record (no fieldmask). It seems like it is
almost possible to do this with MaskedArray:

>>> a = masked_array(
...     [(1,10,100),(0,0,0),(2,20,200)],
...     mask=[False,True,False],
...     dtype=[('x',float), ('y',float), ('c',int)]
... )
>>> a
masked_array(data = [(1.0, 10.0, 100) -- (2.0, 20.0, 200)],
      mask = [False  True False],
      fill_value=???)

except MaskedArray.__getitem__ doesn't check to see if I'm asking for
a field instead of an index:

>>> a['x']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "qlab\ma\core.py", line 1269, in __getitem__
    dout._mask = ndarray.__getitem__(m, indx).reshape(dout.shape)
ValueError: field named x not found.

Modifying ma/core.py so that

    def __getitem__(self, indx):
        ...
            if m is not nomask:
                if not isinstance(indx, basestring):
                    dout._mask = ndarray.__getitem__(m, indx).reshape(dout.shape)
        ...

gets me a little farther. Are there any plans to make this style of
record array usage work with the new MaskedArray? I saw and experimented
with mrecords, but I am concerned about the performance overhead, as I
do not require the granularity of fieldmask. Since mrecords does what I
want and performance is my only worry, I will continue with it and see
how it works for me. I thought I would toss this other usage style out
there anyway because it might be easy to support.
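
To make the usage style concrete, here is a minimal sketch of "one mask
flag per record" written against plain NumPy. It is not the MaskedArray
implementation and not a proposed patch: the class name RecordMasked and
its attributes are hypothetical, and it uses Python 3 (str in place of
basestring). It only shows the dispatch between field names and
positional indices that the modification above is reaching for.

```python
import numpy as np


class RecordMasked:
    """Hypothetical helper: a structured array plus one boolean per record."""

    def __init__(self, data, mask, dtype):
        self.data = np.array(data, dtype=dtype)
        self.mask = np.array(mask, dtype=bool)
        if self.mask.shape != self.data.shape:
            raise ValueError("mask must have exactly one entry per record")

    def __getitem__(self, indx):
        if isinstance(indx, str):
            # Field access: the record-level mask applies unchanged,
            # so the mask itself is not indexed by the field name.
            return np.ma.masked_array(self.data[indx], mask=self.mask)
        # Positional access: index the data and the mask together.
        data = self.data[indx]
        mask = self.mask[indx]
        if isinstance(data, np.void):
            # A single record was requested.
            return np.ma.masked if mask else data
        return RecordMasked(data, mask, data.dtype)


a = RecordMasked(
    [(1, 10, 100), (0, 0, 0), (2, 20, 200)],
    mask=[False, True, False],
    dtype=[('x', float), ('y', float), ('c', int)],
)
x = a['x']      # masked array of the 'x' field; the second entry is masked
sub = a[::2]    # RecordMasked holding the first and third records
```

Because the mask stays one-dimensional, field access costs only a view
of the data plus a reference to the shared mask, which is the overhead
saving I am after compared with a per-field mask.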

Thanks,
Alex

