[Numpy-discussion] Re: ndarray.fill and ma.array.filled
tim.hochberg at cox.net
Fri Apr 7 11:22:05 CDT 2006
> I am posting a reply to my own post in a hope to generate some
> discussion of the original proposal.
> I am proposing to add a "filled" method to ndarray. This can be a
> pass-through, an alias to "copy" or a method to replace nans or some
> other type-specific values. This will allow code that uses "filled"
> to work on ndarrays without changes.
In general, I'm skeptical of adding more methods to the ndarray object
-- there are plenty already.
In addition, it appears that both the method and function versions of
filled are "dangerous" in the sense that they sometimes return the array
itself and sometimes a copy.
Finally, changing ndarray to support masked array feels a bit like the
tail wagging the dog.
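The copy-versus-same-array concern can be probed with a few lines. This
is only a sketch against present-day numpy.ma (an assumption; the
module discussed in this thread predates it), and whether the unmasked
case aliases the original buffer may vary between versions, so the
script prints what it finds rather than promising a result:

```python
import numpy as np
import numpy.ma as ma

# One masked array with no mask at all, one with a single masked element.
unmasked = ma.array([1.0, 2.0, 3.0])
masked = ma.array([1.0, 2.0, 3.0], mask=[False, True, False])

a = unmasked.filled(0.0)
b = masked.filled(0.0)

# Does each result share memory with the underlying data buffer?
print("unmasked result aliases data:", np.shares_memory(a, unmasked.data))
print("masked result aliases data:  ", np.shares_memory(b, masked.data))
print("masked result:", b)
```

If the two printed flags differ, a caller who mutates the result of
filled() cannot know, without checking, whether the source array was
changed as well.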
Let me throw out an alternative proposal. I will admit up front that
this proposal is based on exactly zero experience with masked array, so
there may be some stupidities in it, but perhaps it will lead to an
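    def asUnmaskedArray(obj, fill_value=None):
        # Objects without a mask attribute are treated as unmasked.
        mask = getattr(obj, 'mask', False)
        if mask is False:
            return obj
        if fill_value is None:
            fill_value = obj.get_fill_value()
        # Copy the raw data, then overwrite the masked slots.
        newobj = obj.data().copy()
        newobj[mask] = fill_value
        return newobj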
Or something like that anyway. This particular version should work on
any array as long as, if it exports a mask attribute, it also exports
get_fill_value and data. At least it should once any bugs are ironed
out; I haven't tested it.
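For readers trying this today, here is a hedged adaptation of the
function above to present-day numpy.ma. Assumptions, none of which come
from the original post: the name as_unmasked_array is hypothetical,
data is read as an attribute rather than called as a method, and the
mask is broadcast to the data shape so that a scalar "no mask" value is
handled too.

```python
import numpy as np

def as_unmasked_array(obj, fill_value=None):
    """Hypothetical modern take on asUnmaskedArray (not a NumPy API)."""
    mask = getattr(obj, "mask", None)
    if mask is None:
        # No mask attribute: pass ordinary arrays through unchanged.
        return np.asarray(obj)
    if fill_value is None:
        fill_value = obj.get_fill_value()
    # Always copy here, so the result never aliases the masked array.
    newobj = np.array(obj.data, copy=True)
    # Broadcasting covers the case where mask is a scalar False.
    full_mask = np.broadcast_to(np.asarray(mask, dtype=bool), newobj.shape)
    newobj[full_mask] = fill_value
    return newobj

m = np.ma.array([1.0, 2.0, 3.0], mask=[False, True, False])
out = as_unmasked_array(m, fill_value=-1.0)

plain = np.arange(3)
passed_through = as_unmasked_array(plain)
```

The plain ndarray comes back as the very same object, while the masked
array yields a fresh ndarray with the masked slot overwritten.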
ma would have to be modified to use this instead of using filled
everywhere, but that seems more appropriate than tacking on another
method to ndarray IMO.
One advantage of this approach is that most array-like objects that don't
subclass ndarray will work with this automagically. If we keep expanding
the methods of ndarray, it's harder and harder to implement other
array-like objects since they have to implement more and more methods, most of
which are irrelevant to their particular case. The more we can implement
stuff like this in terms of some relatively small set of core
primitives, the happier we'll all be in the long run. This also builds
on the idea of trying to push as much of the array/view ambiguity into
the asXXXArray corner.
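To make the point about array-like objects concrete, here is a small
sketch. SimpleMasked is a hypothetical class invented for illustration;
it does not subclass ndarray and only exports the three names the
function relies on (mask, get_fill_value, data). The function body is
one reading of the proposal, with the 'mask' attribute name and the two
return statements assumed.

```python
import numpy as np

def asUnmaskedArray(obj, fill_value=None):
    # Assumed reading of the proposal: look up 'mask', default to False.
    mask = getattr(obj, "mask", False)
    if mask is False:
        return obj
    if fill_value is None:
        fill_value = obj.get_fill_value()
    newobj = obj.data().copy()
    newobj[mask] = fill_value
    return newobj

class SimpleMasked:
    """Hypothetical array-like object; deliberately not an ndarray subclass."""

    def __init__(self, values, mask, fill_value=0):
        self._values = np.asarray(values)
        self.mask = np.asarray(mask, dtype=bool)
        self._fill_value = fill_value

    def get_fill_value(self):
        return self._fill_value

    def data(self):
        return self._values

sm = SimpleMasked([1, 2, 3], [False, True, False], fill_value=-1)
filled = asUnmaskedArray(sm)       # new ndarray, masked slot replaced

plain = np.array([4, 5, 6])
same = asUnmaskedArray(plain)      # no mask attribute, returned as-is
```

Nothing beyond those three names had to be implemented on SimpleMasked,
which is the "small set of core primitives" argument in miniature.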
> On 3/22/06, Sasha <ndarray at mac.com> wrote:
> In an ideal world, any function that accepts ndarray would accept
> ma.array and vice versa. Moreover, if the ma.array has no masked
> elements and the same data as ndarray, the result should be the same.
> Obviously the current implementation falls short of this goal, but there
> is one feature that seems to make this goal unachievable.
> This feature is the "filled" method of ma.array. Pydoc for this
> method reports the following:
> | filled(self, fill_value=None)
> | A numeric array with masked values filled. If fill_value is None,
> | use self.fill_value().
> | If mask is nomask, copy data only if not contiguous.
> | Result is always a contiguous, numeric array.
> | # Is contiguous really necessary now?
> That is not the best possible description ("filled" is "filled"), but
> the essence is that the result of a.filled(value) is a contiguous
> ndarray obtained from the masked array by copying non-masked elements
> and using value for masked values.
> I would like to propose to add a "filled" method to ndarray. I see
> several possibilities and would like to hear your opinion:
> 1. Make filled simply return self.
> 2. Make filled return a contiguous copy.
> 3. Make filled replace nans with the fill_value if array is of
> floating point type.
> Unfortunately, adding "filled" will result in a rather confusing
> situation where "fill" and "filled" both exist and have very different
> meanings.
> I would like to note that "fill" is a somewhat odd ndarray method.
> AFAICT, it is the only non-special method that mutates the array. It
> appears to be just a performance trick: the same result can be
> achieved with "a[...] = value".
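The equivalence mentioned in the quoted text is easy to check. A
minimal sketch, using 7.0 as an arbitrary fill value:

```python
import numpy as np

a = np.zeros((2, 3))
b = np.zeros((2, 3))

a.fill(7.0)     # in-place method on ndarray
b[...] = 7.0    # in-place assignment through Ellipsis indexing

same = np.array_equal(a, b)
print(same)
```

Both forms mutate the existing array rather than returning a new one,
which is what sets "fill" apart from the proposed "filled".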