[Numpy-discussion] using NaN, INT_MIN etc in ndarray instead of a masked array

Christian Marquardt christian at marquardt.sc
Tue Apr 18 14:48:06 CDT 2006


On Tue, April 18, 2006 19:36, Gary Strangman wrote:
>
>>> Not true. R supports "NA" for all its types except raw bytes.
>>> For example:
> (snip)
>>
>> For Boolean values there is "room" for a NA value, but what about
>> arbitrary
>> integers.  Does R just limit the range of the integer value?  That's
>> what I
>> meant:  "fiddling with special-values" doesn't generalize to all
>> data-types.
>
> In R, I believe NA = -sys.maxint-1

Don't know if this helps, but I have found the following in the R Data
Import/Export Manual (in section 6.5.1, available at
http://cran.r-project.org/doc/manuals/R-data.html):

   The missing value for R logical and integer types is INT_MIN, the
   smallest representable int defined in the C header limits.h, normally
   corresponding to the bit pattern 0x80000000.
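
To make the integer case concrete, here is a minimal sketch in C. It is not
R's actual code, and the names sketch_na_int, sketch_is_na_int and
sketch_int_bits are made up for illustration. It assumes a 32-bit two's
complement int, for which INT_MIN has the bit pattern 0x80000000:

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical sentinel: reserve the smallest int as "missing". */
int sketch_na_int(void)
{
    return INT_MIN;
}

/* An int is "missing" exactly when it equals the sentinel. */
int sketch_is_na_int(int x)
{
    return x == INT_MIN;
}

/* Bit pattern of an int, viewed as an unsigned 32-bit value
 * (assumes int is 32 bits wide). */
uint32_t sketch_int_bits(int x)
{
    return (uint32_t)x;
}
```

So the integer range is not limited in any other way: exactly one value,
INT_MIN, is given up, and every other int remains usable as data.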

For doubles (I think R only uses double precision internally), it is
apparently a bit more complex; in the section mentioned above, the authors
explain that

   [If R's internal constant definitions / library functions can't be used],
   on all common platforms IEC 60559 (aka IEEE 754) arithmetic is used, so
   standard C facilities can be used to test for or set Inf, -Inf and NaN
   values. On such platforms NA is represented by the NaN value with
   low-word 0x7a2 (1954 in decimal).

The floating point NA value is implemented in the file arithmetic.c of the
R source code; the relevant code snippets defining the NA "value" are
(I believe)

   typedef union
   {
       double value;
       unsigned int word[2];
   } ieee_double;

   #ifdef WORDS_BIGENDIAN
   static CONST int hw = 0;
   static CONST int lw = 1;
   #else  /* !WORDS_BIGENDIAN */
   static CONST int hw = 1;
   static CONST int lw = 0;
   #endif /* WORDS_BIGENDIAN */

   static double R_ValueOfNA(void)
   {
       /* The gcc shipping with RedHat 9 gets this wrong without
        * the volatile declaration. Thanks to Marc Schwartz. */
       volatile ieee_double x;
       x.word[hw] = 0x7ff00000;
       x.word[lw] = 1954;
       return x.value;
   }

and the tests for a number being NA or NaN are

   int R_IsNA(double x)
   {
       if (isnan(x)) {
           ieee_double y;
           y.value = x;
           return (y.word[lw] == 1954);
       }
       return 0;
   }

   int R_IsNaN(double x)
   {
       if (isnan(x)) {
           ieee_double y;
           y.value = x;
           return (y.word[lw] != 1954);
       }
       return 0;
   }
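
For experimenting outside of R, the same idea can be sketched in standalone
C. This is my own illustration, not R's code: the sketch_* names are
hypothetical, and it uses memcpy with a 64-bit integer instead of the union
and the hw/lw indices. It assumes an 8-byte IEEE 754 double whose byte order
matches that of uint64_t, which holds on all common platforms.

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Build a NaN whose high word is 0x7ff00000 (exponent all ones)
 * and whose low word is 1954, mirroring R_ValueOfNA above. */
double sketch_value_of_na(void)
{
    uint64_t bits = ((uint64_t)0x7ff00000u << 32) | 1954u;
    double d;
    memcpy(&d, &bits, sizeof d);
    return d;
}

/* Low 32 bits of the double's representation. */
uint32_t sketch_low_word(double x)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);
    return (uint32_t)(bits & 0xffffffffu);
}

/* NA: a NaN that carries the payload 1954 in its low word. */
int sketch_is_na(double x)
{
    return isnan(x) && sketch_low_word(x) == 1954u;
}

/* "Ordinary" NaN: a NaN with any other low word. */
int sketch_is_nan_not_na(double x)
{
    return isnan(x) && sketch_low_word(x) != 1954u;
}
```

Note that both kinds of value satisfy isnan(); only the low word tells them
apart, so any code that is unaware of the convention will treat NA as a
plain NaN.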

Hope this is useful,

  Christian.





