[Numpy-discussion] NA masks in the next numpy release?
Charles R Harris
Thu Oct 27 21:08:37 CDT 2011
On Thu, Oct 27, 2011 at 7:16 PM, Travis Oliphant <firstname.lastname@example.org> wrote:
> That is a pretty good explanation. I find myself convinced by Matthew's
> arguments. I think that being able to separate ABSENT from IGNORED is a
> good idea. I also like being able to control SKIP and PROPAGATE (but I
> think the current implementation allows this already).
> What is the counter-argument to this proposal?
What exactly do you find convincing? The current masks propagate by default:
In : a = ones(5, maskna=1)
In : a[2] = NA
In : a
Out: array([ 1., 1., NA, 1., 1.])
In : a + 1
Out: array([ 2., 2., NA, 2., 2.])
In : a[2] = 10
In : a
Out: array([ 1., 1., 10., 1., 1.], maskna=True)
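For anyone who wants to try the same sequence without the maskna branch, here is a rough analogue using numpy.ma. This is only a sketch: numpy.ma is a different implementation from the NA masks discussed here, and it is used on the assumption that its behavior is close enough to illustrate the point. Elementwise operations propagate the mask, while reductions such as sum() skip masked entries.

```python
import numpy as np

# numpy.ma stands in for the maskna prototype; it is not the same code.
a = np.ma.ones(5)
a[2] = np.ma.masked            # mark element 2 as missing

b = a + 1                      # elementwise ops carry the mask along
propagated = np.ma.getmaskarray(b).tolist()

skipped_total = a.sum()        # reductions in numpy.ma skip masked entries

a[2] = 10                      # assigning a real value unmasks the element
unmasked = np.ma.getmaskarray(a).tolist()
values = a.data.tolist()
```

Note that the default here differs from the session above: numpy.ma reductions skip by default, whereas the current NA masks propagate.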
I don't see an essential difference between the implementation using masks
and one using bit patterns. The mask, when attached to the original array,
just adds a bit pattern by extending all the types by one byte, an approach
that easily extends to all existing and future types, which is why Mark went
that way for the first implementation given the time available. The masks
are hidden because folks wanted something that behaved more like R, and also
because of the desire to combine missing, ignored, and possibly later bit
patterns in a unified manner. Note that the pseudo assignment was also meant
to look like R. Adding true bit patterns to numpy isn't trivial, and I
believe Mark was thinking of parametrized types for that.
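To make the "extending all the types by one byte" picture concrete, here is a minimal sketch using a structured dtype. The field names "value" and "na" and the helper add_scalar are hypothetical, made up for illustration; this is not how Mark's implementation stores the mask, only a way to see that a value plus a one-byte flag behaves like a wider type.

```python
import numpy as np

# Hypothetical layout: each element is a float64 payload plus a one-byte flag.
# The field names are illustrative only, not numpy API.
masked_f8 = np.dtype([("value", np.float64), ("na", np.uint8)])

a = np.zeros(5, dtype=masked_f8)
a["value"] = 1.0
a["na"][2] = 1                 # flag element 2 as NA


def add_scalar(arr, scalar):
    """Add a scalar to the payload; NA flags carry over unchanged."""
    out = arr.copy()
    out["value"] = arr["value"] + scalar
    return out


b = add_scalar(a, 1)           # the flag on element 2 propagates to b

# Assigning a real value clears the flag, like the pseudo assignment above.
a["value"][2] = 10.0
a["na"][2] = 0
```

The packed item size is 9 bytes (8 for the float64, 1 for the flag), and the same construction works for any base type, which is the extensibility argument in a nutshell.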
The main problems I see with masks are unified storage and possibly memory
use. The rest is just behavior and desired API, and that can be adjusted
within the current implementation. There is nothing essentially masky about