[SciPy-dev] Question about 64-bit integers being cast to double precision

Travis Oliphant oliphant at ee.byu.edu
Mon Oct 10 11:38:03 CDT 2005

Stephen Walton wrote:

>Travis Oliphant wrote:
>>In scipy (as in Numeric), there is the concept of "Casting safely" to a 
>>type.  This concept is used when choosing a ufunc, for example. 
>>My understanding is that a 64-bit integer cannot be cast safely to a 
>>double-precision floating point number, because precision is lost in the 
>>conversion... The result is that on 64-bit systems, the long double type
>>gets used a lot more.  Is this acceptable? Expected?  What do those of
>>you on 64-bit systems think?
>
>I am not on a 64 bit system but can give you the perspective of someone 
>who's thought a lot about floating point precision in the context of 
>both my research and of teaching classes on numerical analysis for 
>physics majors.  To take your example, and looking at it from an 
>experimentalist's viewpoint, sqrt(2) where 2 is an integer has only one 
>significant figure, and so casting it to a long double seems like 
>extreme overkill.

I agree, which is why it concerned me when I saw it.  But it is
consistent with the rest of the casting features.

>With all that, my vote on Travis's specific question:  if conversion of 
>an N-bit integer in scipy_core is required, it gets converted to an 
>N-bit float.  Precision will be lost only if the
>integer is large enough to require more than (N-e) bits for its
>representation, where e is the number of bits in the exponent of the
>floating point representation.

Yes, it is only for large integers that problems arise.   I like this 
scheme and it would be very easy to implement, and it would provide a 
consistent interface.
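
To make the "large integers" point concrete, here is a rough sketch using only the Python standard library. The helper names are made up for illustration, and the 32-bit case is simulated by packing through struct, so this is not how scipy_core does its casting. It assumes IEEE 754 floats, where N-e comes out to 53 bits for a 64-bit float and 24 bits for a 32-bit float.

```python
import struct

def roundtrip_float64(n: int) -> int:
    """Convert an integer to a 64-bit float and back (hypothetical helper)."""
    return int(float(n))

def roundtrip_float32(n: int) -> int:
    """Convert an integer to a 32-bit float and back by packing it with struct."""
    packed = struct.pack("f", float(n))
    return int(struct.unpack("f", packed)[0])

# Integers that fit in the significand survive the round trip unchanged.
small64 = 2**53
small32 = 2**24

# One past the significand width can no longer be represented exactly.
large64 = 2**53 + 1
large32 = 2**24 + 1

print(roundtrip_float64(small64) == small64, roundtrip_float64(large64) == large64)
print(roundtrip_float32(small32) == small32, roundtrip_float32(large32) == large32)
```

So with the N-bit integer to N-bit float rule, a 32-bit integer starts losing precision above 2**24 and a 64-bit integer above 2**53.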

The only problem is that it would mean that on current 32-bit systems,
sqrt(2) would cast 2 to a "single-precision" float and return a
single-precision result.

If that is not a problem, then great...

Otherwise, a more complicated (and less consistent) rule like

integer    float
8-bit      32-bit
16-bit     32-bit
32-bit     64-bit
64-bit     64-bit

would be needed (this is also not too hard to do).
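
As a rough illustration of that rule, the table can be written as a plain lookup. The names below are hypothetical and this is only a sketch, not the scipy_core implementation. The second dictionary assumes IEEE 754 significand widths (24 bits for a 32-bit float, 53 bits for a 64-bit float) so that each row can be checked for whether every integer of that size converts exactly.

```python
# Hypothetical names; this mirrors the table above, not actual scipy_core code.
INT_TO_FLOAT_BITS = {8: 32, 16: 32, 32: 64, 64: 64}

# Significand width, including the implicit leading bit, of IEEE 754 floats.
SIGNIFICAND_BITS = {32: 24, 64: 53}

def float_bits_for(int_bits: int) -> int:
    """Return the float width the table assigns to an integer width."""
    return INT_TO_FLOAT_BITS[int_bits]

def is_always_exact(int_bits: int) -> bool:
    """True if every integer of this width fits in the target significand."""
    return int_bits <= SIGNIFICAND_BITS[float_bits_for(int_bits)]

for bits in sorted(INT_TO_FLOAT_BITS):
    print(bits, float_bits_for(bits), is_always_exact(bits))
```

Under this rule only the 64-bit integer row can lose precision; the other three rows always convert exactly.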

