[Numpy-discussion] [OT] A weekend floating point/compiler question

Fernando Perez Fernando.Perez at colorado.edu
Fri Apr 28 16:21:17 CDT 2006


Hi all,

this is somewhat off-topic, since it's really a gcc/g77 question.  Yet for us 
here (my group) it may lead to the decision to stop using g77 for all fortran 
code and switch to another compiler for our python-wrapped libraries.  So it 
did arise in the context of python usage of in-house code, and I'm appealing 
to anyone who may want to play a little with the question and help.

Feel free to reply off-list to keep the noise down on the list.

The problem arose in some in-house library, but can be boiled down to this:

planck[f77bug]> cat testbug.f
       program  testbug
c
       implicit real *8 (a-h,o-z)
c
       half = 0.5d0
       x    = 0.49d0
       nnx  = 100
       iax  = (x+half)*nnx

       print *, 'Should be 99:',iax

       stop
       end

c EOF


planck[f77bug]> g77 -o testbug.g77 testbug.f
planck[f77bug]> ./testbug.g77
  Should be 99: 98

This can be seen as computing (x/n+1/2)*n and comparing it to x+n/2.  Yes, I 
know about the dangers of floating point roundoff error (I didn't write the 
original code), but a variation of this is used inside a library that began 
crashing for certain inputs.  The point is that this same code works fine with 
the Intel and Lahey compilers, but not with g77.  Now, to add a bit of mystery 
to the question, I wrote the following C code:

planck[f77bug]> cat scanbug.c
#include <stdio.h>

int main(int argc, char* argv[]) {

     double x;
     double eps  = 1e-2;
     double x0   = 0.0;
     double xmax = 1.0;

     int nnx = 100;
     int i = 0;

     double dax;
     int iax,iax_direct;

     x = x0;
     while (x<xmax) {
         // This operation:
         dax = nnx*(x+0.5);
         iax = dax;

         // And this one:
         iax_direct = nnx*(x+0.5);

         // look identical, it's jut that one of them does not use a temporary
         // double variable to hold the result ( Does this mean that the int
         // cast is done in a register straight out of the 80-bit value in the
         // FPU? )

         // And yet, they produce different results for certain values of x
         if (iax != iax_direct) {
             printf("ERROR at x=%e!\n",x);
         }
         x = x0 + i*eps;
         i += 1;
     }
}
// EOF

And this is really where my question is.  The key issue is that nx*(x+0.5) 
produces a different result when truncated to an int, depending on whether a 
temporary double is involved or not.  I tested with the Intel C compiler, and 
it does never report a mismatch, yet a gcc compilation (3.4.3 and 4.0.2) does 
report a number of them.

Any ideas/comments?  Shouldn't the result be independent of the intermediate 
double var?  It is for icc, can this be considered a gcc bug?

Cheers,

f




More information about the Numpy-discussion mailing list