[Numpy-discussion] using NaN, INT_MIN etc in ndarray instead of a masked array
Christian Marquardt
christian at marquardt.sc
Tue Apr 18 14:48:06 CDT 2006
On Tue, April 18, 2006 19:36, Gary Strangman wrote:
>
>>> Not true. R supports "NA" for all its types except raw bytes.
>>> For example:
> (snip)
>>
>> For Boolean values there is "room" for a NA value, but what about
>> arbitrary
>> integers. Does R just limit the range of the integer value? That's
>> what I
>> meant: "fiddling with special-values" doesn't generalize to all
>> data-types.
>
> In R, I believe NA = -sys.maxint-1
Don't know if this helps, but I have found the following in the R Data
Import/Export Manual (in section 6.5.1, available at
http://cran.r-project.org/doc/manuals/R-data.html):
   The missing value for R logical and integer types is INT_MIN, the
   smallest representable int defined in the C header limits.h, normally
   corresponding to the bit pattern 0x80000000.
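To make the integer case concrete, here is a minimal sketch of how a sentinel like this might be used from C. The names NA_INT, is_na_int and sum_skip_na are hypothetical and not taken from R or numpy; the only thing borrowed from the manual is the choice of INT_MIN as the marker. Note the cost Gary's question points at: INT_MIN can no longer be stored as an ordinary value.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical name: the sentinel standing in for a missing integer. */
#define NA_INT INT_MIN

/* Returns 1 if v is the missing-value sentinel, 0 otherwise. */
static int is_na_int(int v)
{
    return v == NA_INT;
}

/* Sums the non-missing entries of values[0..n-1].
 * If n_valid is not NULL, stores the number of entries that were used. */
static long long sum_skip_na(const int *values, size_t n, size_t *n_valid)
{
    long long total = 0;
    size_t used = 0;

    assert(values != NULL || n == 0);
    for (size_t i = 0; i < n; ++i) {
        if (!is_na_int(values[i])) {
            total += values[i];
            ++used;
        }
    }
    if (n_valid != NULL)
        *n_valid = used;
    return total;
}
```

Every routine touching such an array has to perform this check itself, since integer arithmetic does not propagate the sentinel the way floating point propagates NaN.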
For doubles (I think R only uses double precision internally), it's a bit
more complex apparently; in the section mentioned above, the authors
explain that
   [If R's internal constant definitions / library functions can't be used],
   on all common platforms IEC 60559 (aka IEEE 754) arithmetic is used, so
   standard C facilities can be used to test for or set Inf, -Inf and NaN
   values. On such platforms NA is represented by the NaN value with
   low-word 0x7a2 (1954 in decimal).
The floating point NA value is implemented in the file arithmetic.c of
the R source code; the relevant code snippets defining the NA "value"
are (I believe):
   typedef union
   {
       double value;
       unsigned int word[2];
   } ieee_double;

   #ifdef WORDS_BIGENDIAN
   static CONST int hw = 0;
   static CONST int lw = 1;
   #else /* !WORDS_BIGENDIAN */
   static CONST int hw = 1;
   static CONST int lw = 0;
   #endif /* WORDS_BIGENDIAN */

   static double R_ValueOfNA(void)
   {
       /* The gcc shipping with RedHat 9 gets this wrong without
        * the volatile declaration. Thanks to Marc Schwartz. */
       volatile ieee_double x;
       x.word[hw] = 0x7ff00000;
       x.word[lw] = 1954;
       return x.value;
   }
and the tests for a number being NA or NaN are

   int R_IsNA(double x)
   {
       if (isnan(x)) {
           ieee_double y;
           y.value = x;
           return (y.word[lw] == 1954);
       }
       return 0;
   }

   int R_IsNaN(double x)
   {
       if (isnan(x)) {
           ieee_double y;
           y.value = x;
           return (y.word[lw] != 1954);
       }
       return 0;
   }
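
For anyone who wants to try the same trick outside of R, the following is a rough, self-contained sketch of the idea. It is not R's code: it copies the bits with memcpy through a uint64_t instead of using the union and the WORDS_BIGENDIAN switch, and the names NA_PAYLOAD, make_na, low_word, is_na and is_nan_not_na are made up for illustration. It assumes IEEE 754 doubles and that double and uint64_t share the same byte order, which I believe holds on common platforms.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Low 32-bit word that tags a NaN as "NA" (0x7a2 == 1954). */
#define NA_PAYLOAD 1954u

/* Builds a NaN with high word 0x7ff00000 and low word 1954. */
static double make_na(void)
{
    uint64_t bits = ((uint64_t)0x7ff00000u << 32) | NA_PAYLOAD;
    double d;

    assert(sizeof d == sizeof bits);
    memcpy(&d, &bits, sizeof d);
    return d;
}

/* Extracts the low 32 bits of the double's bit pattern. */
static uint32_t low_word(double x)
{
    uint64_t bits;

    memcpy(&bits, &x, sizeof bits);
    return (uint32_t)(bits & 0xffffffffu);
}

/* NaN carrying the NA tag. */
static int is_na(double x)
{
    return isnan(x) && low_word(x) == NA_PAYLOAD;
}

/* Any other NaN. */
static int is_nan_not_na(double x)
{
    return isnan(x) && low_word(x) != NA_PAYLOAD;
}
```

With these definitions, is_na(make_na()) is true while is_na(NAN) is false, since the NAN macro from math.h normally yields a quiet NaN whose low word is zero. Whether the tag survives arithmetic on the value is a separate, platform-dependent question that this sketch does not address.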
Hope this is useful,
Christian.