[Numpy-discussion] [Pytables-users] On Numexpr and uint64 type

Francesc Altet faltet@carabos....
Tue Mar 11 04:44:03 CDT 2008


Hi Marteen,

On Monday 10 March 2008, you wrote:
> > Solution 1) is appealing because it is how NumPy works, but I don't
> > personally like the upcasting to float64.  First of all, because you
> > transparently convert numbers, potentially losing the least
> > significant digits.  Second, because an operation between integers
> > gives a float as a result, and this is different from typical
> > programming languages.
>
> For what it is worth, Py3K will change this behaviour.
> See http://www.python.org/dev/peps/pep-3100/ and PEP 238.
> While it is different from all current languages, that doesn't mean it
> is a good idea to floor() all integer divisions (/me ducks for cover).
>
> > We are mostly inclined to implement behaviour 2), but before
> > proceeding, I'd like to know what other people think about this.
>
> While Py3K is still a while away, I think it is good to keep it in
> mind with new developments.

Thanks for the reminder about the future of the division operator in Py3k.  
However, the use of the / operator in this example is mostly anecdotal.  
The most important point here is how to cast (or not to cast) types other 
than uint64 in order to operate with them.

The thing that makes uint64 so special is that it is the largest integer 
type (on current processors) that has a native representation (i.e. the 
processor can operate on it directly, so it can be processed very 
fast), and, besides, there is no other common native type that can 
fully preserve its precision (float64 has a 53-bit mantissa, which is 
not enough to represent 64 bits).  So the problem is basically what to 
do when operations with uint64 overflow (or underflow, as when dealing 
with negative values).
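To see the precision problem directly, note that Python's own float is an 
IEEE 754 float64, so plain Python already shows the 53-bit mantissa limit 
(a minimal sketch; the particular values are chosen just to illustrate it):

```python
# Python's float is an IEEE 754 float64, with a 53-bit mantissa.
# Consecutive integers above 2**53 can no longer be distinguished,
# so a full 64-bit integer cannot round-trip through float64.
big = 2**53
print(float(big) == float(big + 1))        # True: big + 1 rounds back to big
print(int(float(2**64 - 1)) == 2**64 - 1)  # False: the low bits are lost
```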

In some sense, int64 has exactly the same problem, and typical languages 
seem to cope with this by using modular arithmetic (as Charles Harris 
graciously pointed out).  Python doesn't need to rely on this, because 
when a native integer overflows, the result is silently promoted to a 
long int, which has arbitrary precision in Python (at the expense of 
much slower operations and more space required to store it).  However, 
NumPy and Numexpr (as well as PyTables itself) are all about performance 
and space efficiency, so going to arbitrary precision is a no-go.
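For what it's worth, the modular (wrap-around) behaviour can be emulated 
in plain Python by reducing every result modulo 2**64; `u64` here is just 
a hypothetical helper for illustration, not an actual API:

```python
MOD = 2**64  # the modulus of uint64 arithmetic

def u64(x):
    # Wrap any integer result into the uint64 range [0, 2**64),
    # the way C-style modular arithmetic would.
    return x % MOD

print(u64((2**64 - 1) + 1))  # overflow wraps around to 0
print(u64(0 - 1))            # underflow wraps to 2**64 - 1
```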

So, for me, it is becoming more and more clear that the most sensible 
solution is to implement support for uint64 (and probably int64) as a 
non-upcastable type, with the possible addition of casting operators 
(uint64->int64 and int64->uint64, and also probably int-->int64 and 
int-->uint64), as has been suggested by Timothy Hochberg on the NumPy 
list, and to adopt modular arithmetic for dealing with 
overflows/underflows.  I don't know how difficult it would be to 
implement this, however.
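Just to make the idea concrete, the proposed casting operators could 
behave like the following bit-pattern reinterpretations (the helper names 
are hypothetical; this is only a sketch of the semantics, not a proposed 
API):

```python
def uint64_to_int64(x):
    # Reinterpret the 64-bit pattern as signed two's complement:
    # values at or above 2**63 map to negative numbers.
    return x - 2**64 if x >= 2**63 else x

def int64_to_uint64(x):
    # Reinterpret the signed value as an unsigned 64-bit pattern.
    return x % 2**64

print(uint64_to_int64(2**64 - 1))  # -1
print(int64_to_uint64(-1))         # 2**64 - 1
```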

Cheers,

-- 
>0,0<   Francesc Altet     http://www.carabos.com/
V   V   Cárabos Coop. V.   Enjoy Data
 "-"

