[Numpy-discussion] Re: ndarray.fill and ma.array.filled

Sasha ndarray at mac.com
Mon Apr 10 09:49:05 CDT 2006


On 4/10/06, Bruce Southey <bsouthey at gmail.com> wrote:
>
> [...]
> I think the issue related to how masked values should be handled in
> computation. Does it matter if the result of an operation is due to a
> masked value or numerical problem (like dividing by zero)? (I am
> presuming that it is possible to identify this difference.) If not,
> then I support the idea of treating masked values as NaN.
>

The IEEE standard provides plenty of spare bits in NaNs to represent
pretty much everything, and some languages take advantage of that
feature. (I believe NA and NaN are distinct in R.) In MA, however, mask
elements are boolean and no distinction is made between the various
reasons for not having a data element.  For consistency, a non-trivial
(not always false) implementation of ndarray.mask should return "not
finite" and ignore the bits that distinguish NaNs and infinities.
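
To make the "not finite" idea concrete, here is a minimal sketch. It
assumes a current NumPy, where np.isfinite and np.ma.masked_invalid
exist; ndarray.mask itself is only a proposal in this thread, so the
mask below is computed by hand rather than read from an attribute.

```python
import numpy as np

# Sample data mixing ordinary values with a NaN and both infinities.
data = np.array([1.0, np.nan, np.inf, -np.inf, 4.0])

# A boolean "not finite" mask: True wherever the element is NaN or +/-inf.
# Which kind of non-finite value it was is deliberately discarded.
mask = ~np.isfinite(data)

# numpy.ma offers the same rule as a constructor (assumes a modern NumPy).
masked = np.ma.masked_invalid(data)

print(mask)
print(np.ma.getmaskarray(masked))
```

Both masks flag the same three elements, and neither records whether a
slot held a NaN or an infinity, which matches the boolean-only mask
described above.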

> >The functionality provided by na.actions can always be achieved
> > by calling an extra function (filled or compress).
>
> I am not clear on what you actually mean here.  For example, if you
> are summing across a particular dimension, I would presume that any
> masked value would be ignored and that there would be some record of
> the fact that a masked value was encountered. This would allow that
> 'extra function' to handle the associated result. Alternatively the
> 'extra function'  would have to be included as an argument - which is
> what the na.actions do.
>
If you sum along a particular dimension and encounter a masked value,
the result is masked.  The same is true if you encounter a NaN - the
result is NaN.  If you would like to ignore masked values, you write
a.filled(0).sum() instead of a.sum(). In the 1d case, you can also use
a.compress().sum().  In other words, what in R you achieve with a
flag, such as in sum(a, na.rm=TRUE), in numpy you achieve by an
explicit call to "fill".  This is not quite the same as na.actions in
R, but that is what I had in mind.
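
A small sketch of the two explicit spellings, assuming a current
numpy.ma, where the method that drops masked elements is spelled
compressed(). The behaviour of a.sum() on a masked array is left out on
purpose, since it may differ between MA versions; the NaN case is shown
with a plain ndarray, and np.nansum is included only as the closest
analogue of R's na.rm=TRUE for NaNs.

```python
import numpy as np

# 1d masked array: the second element (2.0) is masked out.
a = np.ma.array([1.0, 2.0, 3.0, 4.0], mask=[False, True, False, False])

# Replace masked slots with 0 and sum the resulting plain ndarray.
total_filled = a.filled(0).sum()

# Drop masked slots entirely (1d result) and sum what is left.
total_compressed = a.compressed().sum()

# For comparison: a NaN in a plain ndarray propagates through sum().
plain = np.array([1.0, np.nan, 3.0, 4.0])
nan_total = plain.sum()

# Skipping NaNs has to be requested explicitly as well.
nan_skipped = np.nansum(plain)

print(total_filled, total_compressed, nan_total, nan_skipped)
```

Note that filled(0) is only the right choice for sums; for a product
the neutral fill value would be 1, which is one reason the fill is left
explicit rather than hidden behind a flag.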



