[Numpy-discussion] behavior of masked arrays

Travis E. Oliphant oliphant@enthought....
Fri Mar 7 11:32:30 CST 2008


Giorgio F. Gilestro wrote:
> Hi Everybody,
> I have some arrays that sometimes need to have some of their values 
> masked away or, simply said, not considered during manipulation.
> I tried to fulfill my purposes using both NaNs and MaskedArray but 
> neither of them really helped completely.
>
> Let's give an example:
>
> from numpy import *
> import numpy
> import scipy.stats
>
> a = array(arange(40).reshape(5,8), dtype=float32)
> b = array(arange(40,80).reshape(5,8), dtype=float32)
> a[1,1] = NaN
>
> tt, ttp = scipy.stats.ttest_ind(a,b,axis=0)
>
> c = numpy.ma.masked_array(a, mask=isnan(a))
> tt1, ttp1 = scipy.stats.ttest_ind(c,b,axis=0)
>
> print (ttp == ttp1).all()
>
> will return True.
>
> My understanding is that only a few functions will be able to properly 
> use MA during execution. Is this correct or am I missing something here?
>   

Yes, that is correct.  A function supports masked arrays natively only 
if it was written with them in mind from the beginning.  Most of the 
functions that NumPy and SciPy provide do not understand the concept of 
a masked array.
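
As a rough illustration of why the mask gets lost, here is a minimal 
sketch using the array from the question.  It assumes a function that 
begins by coercing its input with np.asarray; that is a common pattern 
and is used here only as an assumption, not as a statement about how 
scipy.stats.ttest_ind is implemented.

```python
import numpy as np

# Same data as in the question: one NaN at position [1, 1].
a = np.arange(40, dtype=np.float32).reshape(5, 8)
a[1, 1] = np.nan
c = np.ma.masked_array(a, mask=np.isnan(a))

# Assumption: a mask-unaware function coerces its input like this.
# The result is a plain ndarray; the mask is gone, the NaN is not.
plain = np.asarray(c)
plain_mean = plain.mean(axis=0)   # column 1 is NaN

# The masked array's own method skips the masked entry instead.
masked_mean = c.mean(axis=0)      # column 1 averages the 4 valid values
```

With the plain array the NaN propagates into column 1 of the result, 
while the masked mean for that column is computed from the remaining 
four values.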

There is a price to be paid for checking on the validity of the data for 
every function and so people differ on whether or not there *should* be 
support for masked arrays on a very low level.

I support the concept of separate masked-array functions which do not 
penalize non-masked-array functions significantly.  Perhaps generic 
functions can help us here, so that the interface to the user is the 
same but the underlying function called differs depending on whether 
or not the array is masked.  As long as this check is done per array 
and not per element, its cost is usually not significant.
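
The per-array dispatch idea can be sketched roughly as follows.  The 
name column_mean is hypothetical and exists only for this example; it 
is not a NumPy or SciPy function, and this is one possible shape for 
such a wrapper rather than an actual implementation.

```python
import numpy as np

def column_mean(x):
    """Hypothetical wrapper: pick the implementation once per array."""
    if isinstance(x, np.ma.MaskedArray):
        # Masked path: mask handling is paid for only when needed.
        return x.mean(axis=0)
    # Plain path: no validity checks on the data at all.
    return np.asarray(x).mean(axis=0)

plain = np.arange(6.0).reshape(2, 3)          # [[0, 1, 2], [3, 4, 5]]
masked = np.ma.masked_array(
    plain, mask=[[False, True, False], [False, False, False]]
)

plain_result = column_mean(plain)     # uses every value
masked_result = column_mean(masked)   # ignores the masked 1.0 in column 1
```

The caller sees a single interface, and the only overhead added for 
plain arrays is one isinstance check per call.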

-Travis O.


> Thanks