# [SciPy-dev] Question about 64-bit integers being cast to double precision

Travis Oliphant oliphant at ee.byu.edu
Mon Oct 10 11:38:03 CDT 2005

```
Stephen Walton wrote:

>Travis Oliphant wrote:
>
>>In scipy (as in Numeric), there is the concept of "Casting safely" to a
>>type.  This concept is used when choosing a ufunc, for example.
>>
>>My understanding is that a 64-bit integer cannot be cast safely to a
>>double-precision floating point number, because precision is lost in the
>>conversion... The result is that on 64-bit systems, the long double type
>>gets used a lot more.  Is this acceptable? Expected?  What do those of
>>you on 64-bit systems think?
>>
>I am not on a 64 bit system but can give you the perspective of someone
>who's thought a lot about floating point precision in the context of
>both my research and of teaching classes on numerical analysis for
>physics majors.  To take your example, and looking at it from an
>experimentalist's viewpoint, sqrt(2) where 2 is an integer has only one
>significant figure, and so casting it to a long double seems like
>extreme overkill.
>
I agree, which is why it concerned me when I saw it.  But, it is
consistent with the rest of the casting features.

>With all that, my vote on Travis's specific question:  if conversion of
>an N-bit integer in scipy_core is required, it gets converted to an
>N-bit float.  Precision will be lost only if the integer is large
>enough to require more than (N-e) bits for its representation, where
>e is the number of bits in the exponent of the floating point
>representation.
>

Yes, it is only for large integers that problems arise.   I like this
scheme and it would be very easy to implement, and it would provide a
consistent interface.

The only problem is that it would mean that on current 32-bit systems
sqrt(2) would cast 2 to a "single-precision" float and return a
single-precision result.

If that is not a problem, then great...

Otherwise, a more complicated (and less consistent) rule like

integer     float
=================
8-bit       32-bit
16-bit      32-bit
32-bit      64-bit
64-bit      64-bit

would be needed (this is also not too hard to do).

-Travis

```
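The precision question in this thread can be checked with a round trip: convert an integer to a float of a given width, convert back, and see whether the value survived. The sketch below is illustrative only. It uses the Python standard library rather than scipy_core's casting machinery, it assumes Python floats are IEEE 754 doubles (true for CPython on common platforms), and the names `round_trips`, `cast_is_safe`, and `TABLE_RULE` are hypothetical.

```python
import struct


def round_trips(value, float_bits):
    """Return True if the integer `value` survives conversion to an
    IEEE 754 float of width `float_bits` (32 or 64) and back."""
    as_double = float(value)  # assumed to be a 64-bit double
    if float_bits == 64:
        return int(as_double) == value
    if float_bits == 32:
        # Packing with "f" rounds the double to single precision.
        (as_single,) = struct.unpack("f", struct.pack("f", as_double))
        return int(as_single) == value
    raise ValueError("float_bits must be 32 or 64")


def cast_is_safe(int_bits, float_bits):
    """Check the extreme values of a signed `int_bits`-wide integer type.
    The largest value has the most significant bits, so it is the hardest
    case: if it converts exactly, every value of that type does."""
    lowest = -2 ** (int_bits - 1)
    highest = 2 ** (int_bits - 1) - 1
    return round_trips(lowest, float_bits) and round_trips(highest, float_bits)


# The integer-to-float table from the message (illustrative name).
TABLE_RULE = {8: 32, 16: 32, 32: 64, 64: 64}

if __name__ == "__main__":
    for int_bits, float_bits in TABLE_RULE.items():
        verdict = "exact" if cast_is_safe(int_bits, float_bits) else "lossy for large values"
        print(f"{int_bits}-bit int -> {float_bits}-bit float: {verdict}")
```

Under these assumptions, small integers such as 2 convert exactly at either width, which matches the point that only large integers cause trouble. The N-bit to N-bit rule loses precision once an integer needs more than 24 bits (single) or 53 bits (double), and the 64-bit row of the table is the one entry of the more complicated rule that is still lossy.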