[Numpy-discussion] ndarray.fill and ma.array.filled

Tim Hochberg tim.hochberg at cox.net
Wed Mar 22 15:59:23 CST 2006


Eric Firing wrote:

> Sasha wrote:
>
>> In an ideal world, any function that accepts ndarray would accept
>> ma.array and vice versa.  Moreover, if the ma.array has no masked
>> elements and the same data as ndarray, the result should be the same. 
>
>
> This would be *very* nice.
>
>> Obviously the current implementation falls short of this goal, but
>> there is one feature that seems to make this goal unachievable.
>>
>> This feature is the "filled" method of ma.array.  Pydoc for this
>> method reports the following:
>>
>>  |  filled(self, fill_value=None)
>>  |      A numeric array with masked values filled. If fill_value is None,
>>  |                 use self.fill_value().
>>  |
>>  |                 If mask is nomask, copy data only if not contiguous.
>>  |                 Result is always a contiguous, numeric array.
>>  |      # Is contiguous really necessary now?
>>
>>
>> That is not the best possible description ("filled" is "filled"), but
>> the essence is that the result of a.filled(value) is a contiguous
>> ndarray obtained from the masked array by copying non-masked elements
>> and using value for masked values.
>>
>> I would like to propose adding a "filled" method to ndarray.  I see
>> several possibilities and would like to hear your opinion:
>>
>> 1. Make filled simply return self.
>>
>> 2. Make filled return a contiguous copy.
>>
>> 3. Make filled replace nans with the fill_value if the array is of
>> floating point type.
>
>
> It seems to me that any function or method that returns an array from 
> an array should be perfectly consistent and explicit about whether it 
> makes a copy or not.  Sometimes the filled method *needs* to return a 
> copy; therefore it should *always* return a copy, regardless of the 
> presence or state of masking. Hence I think the filled method of ma 
> needs to be changed in this way also.

+1

(mumble mumble reshape mumble)
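
To make the copy question concrete, here is a minimal sketch of the
"always return a copy" behaviour. The helper name filled_copy is
hypothetical (it is not part of numpy or ma), and it assumes the ma
package as exposed in NumPy today as numpy.ma. As far as I know, the
library's own filled() may hand back the underlying data without copying
when nothing is masked, which is exactly the inconsistency being
discussed.

```python
import numpy as np


def filled_copy(a, fill_value=0):
    """Hypothetical helper: always return a fresh ndarray, masked or not."""
    if isinstance(a, np.ma.MaskedArray):
        # getmaskarray always yields a full boolean array, even for nomask.
        mask = np.ma.getmaskarray(a)
        out = np.array(np.ma.getdata(a), copy=True)
        out[mask] = fill_value
        return out
    # Plain ndarray (or array-like): nothing to fill, but still copy.
    return np.array(a, copy=True)


plain = np.array([1.0, 2.0, 3.0])
unmasked = np.ma.array([1.0, 2.0, 3.0])
masked = np.ma.array([1.0, 2.0, 3.0], mask=[False, True, False])

for arr in (plain, unmasked, masked):
    result = filled_copy(arr, fill_value=-1.0)
    # The last column reports whether the result aliases the input buffer.
    aliased = np.shares_memory(result, np.ma.getdata(arr))
    print(type(arr).__name__, result, aliased)
```

With this rule the caller can modify the result in place without ever
touching the original, whatever the state of the mask.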

>
> The question for your suggestion 3 is, should a nan always be the 
> equivalent of a masked value?  One loses a little flexibility, but it 
> has an appealing simplicity to it.  I could be persuaded otherwise, 
> but right now I would vote for it.

-0 (I'd be -1, but I don't really use masked arrays).

While masked arrays should be convenient for masking, the basic ndarray 
object should be as flexible as possible, since it's the building block 
for all the rest.
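
As a rough illustration of how option 3 could live outside ndarray,
here is a sketch of a free function that replaces nans, next to an
explicit opt-in that turns nans into masked values. The name fill_nans
is hypothetical; np.where, np.isnan and np.ma.masked_invalid are
existing NumPy calls, assuming a reasonably recent NumPy. Note that
masked_invalid also masks infinities, not only nans.

```python
import numpy as np


def fill_nans(a, fill_value):
    """Hypothetical free function for option 3: replace nans, copy the rest."""
    a = np.asarray(a)
    if not np.issubdtype(a.dtype, np.floating):
        # Non-float arrays are not handled by this sketch; just copy them.
        return a.copy()
    # np.where builds a new array, so the input is left untouched.
    return np.where(np.isnan(a), fill_value, a)


data = np.array([1.0, np.nan, 3.0])

# nan treated as "missing" only because the caller asked for it.
print(fill_nans(data, 0.0))

# Explicit conversion to a masked array when masking semantics are wanted.
as_masked = np.ma.masked_invalid(data)
print(as_masked.filled(0.0))
```

Either way the nan-as-mask interpretation is chosen by the caller rather
than built into ndarray itself.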


-tim