[Numpy-discussion] Why does np.nan{min, max} clobber my array mask?

David Carmean dlc@halibut....
Sat Feb 13 21:04:10 CST 2010


I'm just starting to work with masked arrays and I've found some behavior that 
definitely does not follow the Principle of Least Surprise:

I've generated a 2-d array from a list of lists, where the elements are floats with 
a good number of NaNs.  Inspection shows the expected numbers for ma.count() and 
ma.count_masked().
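
For reference, the setup described above can be reproduced with a minimal sketch like the following. The small `rows` list is a hypothetical stand-in for the real 99x56 data, and `np.isnan` is used here in place of the `x+0 != x` comparison that appears in the session below.

```python
import numpy as np
import numpy.ma as ma

# Hypothetical stand-in for the real data: a small list of lists with NaNs.
rows = [[1.0, float("nan"), 3.0],
        [float("nan"), float("nan"), 6.0]]
arr = np.array(rows, dtype=float)

# Count NaNs directly on the plain ndarray.
n_nan = int(np.isnan(arr).sum())

# masked_invalid masks every NaN (and inf) entry.
msk = ma.masked_invalid(arr)

n_valid = ma.count(msk)          # number of unmasked elements
n_masked = ma.count_masked(msk)  # number of masked elements
```

With this input the number of masked elements should match the number of NaNs, and the two counts together should add up to `arr.size`.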

However, as soon as I run np.nanmin() or np.nanmax() over it, all of the mask elements 
are reset to False.


    (Pdb) flat = flatten(uut)	# my own utility function
    (Pdb) len ( [ x for x in flat if x+0 == x ] )  # only way I could figure to detect 
    4086
    (Pdb) len ( [ x for x in flat if x+0 != x ] )  # 1458 NaNs in the set.
    1458
    (Pdb) msk = ma.masked_invalid(uut)		
    (Pdb) msk.shape
    (99, 56)
    (Pdb) ma.count(msk)
    4086
    (Pdb) ma.count_masked(msk)
    1458
    (Pdb) msk.hardmask			
    False
    (Pdb) msk.harden_mask()		# harden the mask first, for demo
    masked_array(data =....
    (Pdb) msk.hardmask
    True
    (Pdb) rslt_hm = np.nanmin(msk, axis=1)
    (Pdb) rslt_hm.shape
    (99,)
    (Pdb) ma.count_masked(rslt_hm)
    0
    (Pdb) ma.count(rslt_hm)
    99
    # Is my original still OK?
    (Pdb) msk
    masked_array(data = ...
    ... [False False False ...,  True  True  True]],
           fill_value = 1e+20)
    (Pdb) msk.soften_mask() 		#  now re-soften the mask:
    masked_array(data = ....
    (Pdb) rslt_softmask = np.nanmin(msk, axis=1)
    (Pdb) rslt_softmask.shape
    (99,)
    (Pdb) msk.mask.any()
    False
    # BAM!   note:  'control' is a hardmasked control copy: 
    (Pdb) control.mask.any()
    True

As the above shows, I discovered that I can work around this by setting the hardmask 
property, but ... there is no mention of such a side-effect in the docs (including 
the brand-new reference book).
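
A possible alternative to hardening the mask, sketched here under the assumption that the goal is simply a per-row minimum that ignores the NaNs: either use the masked array's own `min()` reduction, or hand `np.nanmin` a plain ndarray copy obtained with `filled(np.nan)`. The array `data` is made up for illustration, and this sketch does not try to reproduce the 1.4.0 behavior shown above.

```python
import numpy as np
import numpy.ma as ma

# Illustrative data only; not the array from the session above.
data = np.array([[1.0, np.nan, 3.0],
                 [np.nan, 5.0, 6.0]])
msk = ma.masked_invalid(data)
mask_before = ma.getmaskarray(msk).copy()

# Option 1: the masked-array reduction skips masked elements.
row_min = msk.min(axis=1)

# Option 2: give nanmin a plain ndarray copy in which masked entries are NaN,
# so the masked array itself is never passed to the nan-function.
row_min_nan = np.nanmin(msk.filled(np.nan), axis=1)

# Check that the original mask was left alone.
mask_unchanged = np.array_equal(ma.getmaskarray(msk), mask_before)
```

Neither option needs harden_mask(), since in both cases the reduction only reads from the masked array.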

Have I found a bug?  This is NumPy 1.4.0 running under 64-bit Windows 7 (Python(x,y) distribution).






