[Numpy-discussion] feedback request: proposal to add masks to the core ndarray

Nathaniel Smith njs@pobox....
Fri Jun 24 11:28:33 CDT 2011


On Fri, Jun 24, 2011 at 7:06 AM, Robert Kern <robert.kern@gmail.com> wrote:
> The alternative proposal would be to add a few new dtypes that are
> NA-aware. E.g. an nafloat64 would reserve a particular NaN value
> (there are lots of different NaN bit patterns, we'd just reserve one)
> that would represent NA. An naint32 would probably reserve the most
> negative int32 value (like R does). Using the NA-aware dtypes signals
> that you are using NA values; there is no need for an additional flag.

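For floats, this is easy, because NaNs are already built in. For
integers, I worry a bit, because we'd have to break the usual two's
complement arithmetic. int32 is closed under
addition/multiplication/bitops: every result wraps around to another
valid int32. But for naint32, what's INT_MAX + 1? With plain
wraparound it lands exactly on the reserved most-negative value.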

(In R, the answer is that *all* integer overflows are tested for and
become NA, whether they would happen to land on INT_MIN or not, and
AFAICT there are no bitops for integers.)
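
To make the concern concrete, here is a rough sketch in today's NumPy.
The names NA and na_add are made up for illustration (no naint32 dtype
exists), and the overflow check just widens to int64, which is an
assumption about how one might do it, not a description of R's or
NumPy's implementation:

```python
import numpy as np

INT32_MAX = np.iinfo(np.int32).max
INT32_MIN = np.iinfo(np.int32).min
NA = INT32_MIN  # hypothetical sentinel: reserve the most negative int32


def na_add(a, b):
    """R-style addition sketch: NA inputs and any overflow both give NA."""
    a = np.asarray(a, dtype=np.int32)
    b = np.asarray(b, dtype=np.int32)
    # Widen to int64 so the true sum is representable and can be range-checked.
    wide = a.astype(np.int64) + b.astype(np.int64)
    # Valid results are INT32_MIN + 1 .. INT32_MAX; INT32_MIN itself is NA.
    bad = (a == NA) | (b == NA) | (wide > INT32_MAX) | (wide <= INT32_MIN)
    return np.where(bad, NA, wide).astype(np.int32)


# Plain int32 array addition wraps silently, so INT_MAX + 1 lands on the
# bit pattern that was reserved for NA.
wrapped = np.array([INT32_MAX], dtype=np.int32) + np.array([1], dtype=np.int32)

# The checked version turns every overflow into NA, whether or not the
# wrapped value would have hit the sentinel.
checked = na_add([INT32_MAX, 1, NA, INT32_MAX], [1, 2, 5, INT32_MAX])
```

The point is that the check has to run on every arithmetic operation,
which is exactly the break from ordinary two's complement behaviour.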

For strings in the numpy context, just adding another byte to hold the
NA-ness flag seems more sensible than reserving some arbitrary string
value as the NA marker.
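
As a rough illustration of the extra-byte idea, a structured dtype can
already pair a fixed-width string with a one-byte flag. The field names
"value" and "is_na" and the width S8 are arbitrary choices for this
sketch, not part of any proposal:

```python
import numpy as np

# Fixed-width 8-byte string plus one flag byte marking NA-ness.
na_str = np.dtype([("value", "S8"), ("is_na", np.bool_)])

arr = np.zeros(3, dtype=na_str)
arr["value"] = [b"alpha", b"", b"NA"]
arr["is_na"] = [False, True, False]

# No string value is reserved: the literal b"NA" and the empty string
# both remain ordinary data, and only the flag decides what is missing.
present = arr["value"][~arr["is_na"]]
```

Each element costs one byte more than the bare string (na_str.itemsize
is 9 here, since the default layout is packed).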

In both cases (integers and strings), the more generic maybe() dtype I
suggested might be cleaner.

-- Nathaniel

