[Numpy-discussion] feedback request: proposal to add masks to the core ndarray
Fri Jun 24 07:30:07 CDT 2011
On 2011-06-24 13:59, Nathaniel Smith <firstname.lastname@example.org> wrote:
> On Thu, Jun 23, 2011 at 5:56 PM, Benjamin Root<email@example.com> wrote:
>> Lastly, I am not entirely familiar with R, so I am also very curious about
>> what this magical "NA" value is, and how it compares to how NaNs work.
>> Although, Pierre brought up the very good point that NaNs wouldn't work
>> anyway with integer arrays (and object arrays, etc.).
> Since R is designed for statistics, they made the interesting decision
> that *all* of their core types have a special designated "missing"
> value. At the R level this is just called "NA". Internally, there are
> a bunch of different NA values -- for floats it's a particular NaN,
> for integers it's INT_MIN, for booleans it's 2 (IIRC), etc. (You never
> notice this, because R will silently cast an NA of one type into an NA of
> another type whenever needed, and they all print the same.)
> Because any array can contain NA's, all R functions then have to have
> some way of handling this -- all their integer arithmetic knows that
> INT_MIN is special, for instance. The rules are basically the same as
> for NaN's, but NA and NaN are different from each other (because one
> means "I don't know, could be anything" and the other means "you tried
> to divide by 0, I *know* that's meaningless").
> That's basically it.
> -- Nathaniel
Would it be possible to use R's system for expressing "missing values"
in numpy through a special flag?
Any given numpy array could have a boolean flag (say "na_aware")
indicating that some of its values represent missing cells.
If the exact same system is used, interaction with R (through something
like rpy2) would be simplified and more robust.
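To make the flag idea concrete, here is a rough Python sketch. It is not an
existing numpy API: the NAAwareInt32 class, its isna() method and the
NA_INT32 name are made up for illustration. The only detail taken from the
message above is that R marks a missing integer with INT_MIN. The sketch
assumes NA should propagate through addition the way NaN does, and it
ignores genuine int32 overflow.

```python
import numpy as np

# Assumption: reuse R's integer NA sentinel, which the quoted message
# says is INT_MIN.
NA_INT32 = np.iinfo(np.int32).min


class NAAwareInt32:
    """Hypothetical wrapper: an int32 array plus an 'na_aware' flag."""

    def __init__(self, values, na_aware=True):
        self.data = np.asarray(values, dtype=np.int32)
        self.na_aware = na_aware

    def isna(self):
        # Without the flag, INT_MIN is treated as an ordinary integer.
        if not self.na_aware:
            return np.zeros(self.data.shape, dtype=bool)
        return self.data == NA_INT32

    def __add__(self, other):
        # Add in int64 so the sentinel itself cannot wrap around, then
        # write NA back wherever either operand was missing.
        total = self.data.astype(np.int64) + other.data.astype(np.int64)
        missing = self.isna() | other.isna()
        result = np.where(missing, NA_INT32, total).astype(np.int32)
        return NAAwareInt32(result, self.na_aware or other.na_aware)


a = NAAwareInt32([1, NA_INT32, 3])
b = NAAwareInt32([10, 20, NA_INT32])
c = a + b
print(c.isna())
```

Because the sentinel has the same bit pattern as R's integer NA, a buffer
like c.data could in principle be handed to R without translating the
missing entries, which is the robustness argument made above.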
PS: In R, dividing one by zero returns +/-Inf, not NaN. 0/0 returns NaN.
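numpy's floating-point division follows the same IEEE 754 behaviour, which
can be checked with a short snippet. This is only an illustration of the
arithmetic; the variable name is arbitrary, and np.errstate is used here
just to silence the warnings numpy would otherwise emit.

```python
import numpy as np

# Suppress the divide-by-zero and invalid-operation warnings for this demo.
with np.errstate(divide="ignore", invalid="ignore"):
    quotients = np.array([1.0, -1.0, 0.0]) / 0.0

# Expect +Inf for 1/0, -Inf for -1/0, and NaN only for 0/0.
print(quotients)
```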