[Numpy-discussion] uint64 typecasting with scalars broken (?)
Charles R Harris
charlesr.harris@gmail....
Mon Apr 23 22:27:39 CDT 2007
On 4/23/07, Travis Oliphant <oliphant.travis@ieee.org> wrote:
>
> Christian Marquardt wrote:
> > Hello,
> >
> > The following is what I expected...
> >
> > >>> y = 1234
> > >>> x = array([1], dtype = "uint64")
> > >>> print x + y, (x + y).dtype.type
> > [1235] <type 'numpy.uint64'>
> >
> >
>
> This is "what you expect" only because y is a scalar and cannot
> determine the "kind" of the output.
>
> > but is this the way it should be? (numpy 1.0.2, Linux, Intel compilers)
> >
> > >>> print x[0] + y, type(x[0] + y)
> > 1235.0 <type 'numpy.float64'>
> >
>
> This is correct (sort of) because in a mixed operation between uint64
> and int32, because there is no int128, the sum must be placed in a
> float. In reality it should be a long-double float but it was decided
> not to perpetuate long double floats like this because then on 64-bit
> platforms they would be showing up everywhere.
I wonder if returning int64 wouldn't be better in this case. It has more
precision than a double and has the advantage of being an integer. True,
uint64s with the msb set would be wrongly interpreted, but... Or maybe throw
an error when mixing the two, since really the result can't be relied on. If
the latter, it would still be nice to have an interpretation for Python
integers, so just interpret them as uint64 (I believe a C type cast does
this) and just add using the usual modular arithmetic. This still allows
incrementing and decrementing. For instance, using a 2-bit uint2 as an
example (values written in binary):

3 - 1 == 3 + (-1) == 0b11 + 0b11 == 0b10 == 2

I think this makes the most sense, after all, subtraction is defined for
uints and we already use modular arithmetic. I admit some strange results
can show up, but no stranger than treating the integers as floats having
53-bit precision. I suppose we could raise an error on integer overflow, but
that isn't how we have done things with the other integers.
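
To make the comparison concrete, here is a rough sketch in plain Python (no numpy, so it says nothing about what numpy itself does). The helpers to_unsigned and add_unsigned are made-up names for illustration only. It replays the uint2 example, then decrements the largest uint64 value once with wraparound and once by going through doubles.

```python
# Hypothetical helpers, not a NumPy API: reduce a Python int modulo 2**bits,
# which is what a C cast to an unsigned type of that width does.
def to_unsigned(value, bits):
    return value & ((1 << bits) - 1)


def add_unsigned(a, b, bits):
    """Add two integers with wraparound in an unsigned type of the given width."""
    return to_unsigned(to_unsigned(a, bits) + to_unsigned(b, bits), bits)


# The uint2 example: 3 - 1 computed as 3 + (-1).
two_bit = add_unsigned(3, -1, 2)

# Same idea for uint64: decrementing the largest value stays exact.
big = 2**64 - 1
wrapped = add_unsigned(big, -1, 64)

# Going through doubles instead loses the low bits of a value this large.
via_float = int(float(big) + float(-1))

print(two_bit)               # 2
print(wrapped == big - 1)    # True
print(via_float == big - 1)  # False
```

The modular version gives the exact answer for increments and decrements, while the float path is off because a double cannot represent every integer near 2**64.
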
Of course, all the other mixed integer types with the same number of bits
could be treated the same way, which would at least be consistent. The user
would then have to specify a larger type whenever it was needed and
explicitly deal with the case of uint64. Let's see what C does:
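
#include <stdio.h>

int main(void)
{
    unsigned long long x = 0;
    long long y = -1;

    /* y is converted to unsigned long long, so x + y wraps modulo 2^64;
       cast back to signed so the argument matches the %lld conversion. */
    printf("%lld\n", (long long)(x + y));
    return 0;
}
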
prints -1, as expected, and doesn't issue a compiler warning with -Wall.
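
The same reinterpretation can be poked at from Python with the standard ctypes module. This is only a sketch of the bit-level behaviour, assuming the usual two's complement representation, and it is not how numpy handles the mixed case.

```python
import ctypes

# Mirror the C program: x is unsigned 64-bit, y is signed 64-bit.
x = ctypes.c_uint64(0)
y = ctypes.c_int64(-1)

# ctypes does no overflow checking, so the constructor wraps modulo 2**64,
# much like the conversion C applies to y before the addition.
total = ctypes.c_uint64(x.value + y.value)

# Reading the same 64 bits as a signed value gives the -1 that printf shows.
as_signed = ctypes.c_int64(total.value)

print(total.value)      # 18446744073709551615
print(as_signed.value)  # -1
```

So the sum is really the largest uint64 value, and it only looks like -1 because it is printed as a signed integer.
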
Chuck
