[Numpy-discussion] comparing arrays with NaN in them.

Pierre GM pgmdevlist@gmail....
Fri Aug 24 12:03:40 CDT 2007


All,

Using the maskedarray package:
>>> import numpy
>>> import maskedarray as ma
>>> x = numpy.array([1,numpy.nan,3])
>>> y = numpy.array([1,numpy.nan,3])
>>> ma.allclose(ma.array(x,mask=numpy.isnan(x)),ma.array(y,mask=numpy.isnan(y)))
True

or even simpler:
>>> import maskedarray.testutils
>>> maskedarray.testutils.assert_equal(x,y)
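
For anyone without the sandbox package, the same idea can be sketched with
numpy alone. This is only an illustration: it assumes a NumPy recent enough
to ship numpy.ma.allclose, the equal_nan argument of numpy.allclose, and
numpy.testing.assert_array_equal. The names mx, my, same_mask and same_data
are made up for the example.

```python
import numpy as np
import numpy.ma as ma

x = np.array([1, np.nan, 3])
y = np.array([1, np.nan, 3])

# Mask the NaN entries so they are left out of the comparison.
mx = ma.array(x, mask=np.isnan(x))
my = ma.array(y, mask=np.isnan(y))

# The NaNs must sit in the same positions, and the remaining values must agree.
same_mask = np.array_equal(ma.getmaskarray(mx), ma.getmaskarray(my))
same_data = bool(ma.allclose(mx, my))
print(same_mask and same_data)

# Unmasked alternative: treat NaNs at matching positions as equal.
print(bool(np.allclose(x, y, equal_nan=True)))

# Raises AssertionError on a mismatch; NaNs at matching positions pass.
np.testing.assert_array_equal(x, y)
```

Checking the masks separately matters because allclose on masked input skips
masked entries, so two arrays whose NaNs sit in different places could
otherwise still compare as close.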

#........................................

> What's the status of the two masked array implementations? 

One is official but no longer really supported (numpy.ma), one is still 
unofficial but fully functional (maskedarray), and supported (by me at 
least). My understanding is that maskedarray will stay in the sandbox as long 
as we don't have enough feedback from users.


> Which should 
> I use? Unless there are huge feature differences (which I don't think
> there are), 

Actually there is at least one big difference:

the masked arrays you get from numpy.ma are NOT ndarrays. Therefore, code 
like:
>>> numpy.asanyarray(numpy.ma.array([1,2,3],mask=[0,1,0]))
array([1, 2, 3])
loses your mask.

On the other hand, the maskedarray package (still in the sandbox) implements 
masked arrays as a subclass of ndarray, so the mask survives:
>>> numpy.asanyarray(maskedarray.array([1,2,3],mask=[0,1,0]))
masked_array(data = [1 -- 3],
      mask = [False  True False],
      fill_value=999999)
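
The mechanism behind this difference can be sketched without either masked
array package. The class TaggedArray below is hypothetical and exists only to
stand in for "some subclass of ndarray"; the point is that numpy.asanyarray
passes subclass instances through, while numpy.asarray converts them to the
base class.

```python
import numpy as np


class TaggedArray(np.ndarray):
    """Hypothetical ndarray subclass, used only for illustration."""


# View an ordinary array as the subclass.
t = np.array([1, 2, 3]).view(TaggedArray)

kept = np.asanyarray(t)    # subclass instances pass through
dropped = np.asarray(t)    # converted to a base-class ndarray

print(type(kept).__name__)
print(type(dropped).__name__)
```

Anything a subclass carries on top of the raw data, such as a mask, can only
survive a call to asanyarray if the object really is an ndarray subclass.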

Apart from that, maskedarray implements more functions and methods than are 
available in numpy.ma.

> then I want to use the one that's going to get maintained 
> into the future -- do we know yet which that will be?

I've already committed myself to supporting maskedarray for the time 
being.

Eric Firing and I have been in contact over the last few weeks about how to 
optimize maskedarray, for example by porting part of the code to C. There are 
still a couple of conceptual issues we need to address first, as presented in 
another thread.

