[Numpy-discussion] [Pytables-users] On Numexpr and uint64 type
Tue Mar 11 04:44:03 CDT 2008
On Monday 10 March 2008, you wrote:
> > Solution 1) is appealing because it is how NumPy works, but I don't
> > personally like the upcasting to float64. First of all, because
> > you transparently convert numbers, potentially losing the least
> > significant digits. Second, because an operation between integers
> > gives a float as a result, and this is different from typical
> > programming languages.
> For what it is worth, Py3K will change this behaviour.
> See http://www.python.org/dev/peps/pep-3100/ and PEP 238.
> While it is different from all current languages, that doesn't mean
> it is a good idea to floor() all integer divisions (/me ducks for
> cover).
> > We are mostly inclined to implement behaviour 2), but before
> > proceeding, I'd like to know what other people think about this.
> While Py3K is still a while away, I think it is good to keep it in
> mind with new developments.
Thanks for the reminder about the future of the division operator in
Py3k. However, the use of the / operator in this example is mostly
incidental. The most important point here is how to cast (or not to
cast) types other than uint64 in order to operate with them.
What makes uint64 so special is that it is the largest integer type
(in current processors) that has a native representation (i.e. the
processor can operate directly on it, so it can be processed very
fast), and besides, there is no other (common native) type that can
hold its full precision (float64 has a 53-bit mantissa, which is not
enough to represent 64 bits). So the problem is basically what to do
when operations with uint64 overflow (or underflow, for example when a
result would be negative).
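
To make the precision point concrete, here is a small plain-Python
sketch (no NumPy or Numexpr involved; the variable names are only
illustrative). It relies on Python floats being IEEE 754 doubles, which
is the case on common platforms:

```python
# Illustrative sketch: how float64 loses the low bits of large integers.
UINT64_MAX = 2**64 - 1

# Every integer up to 2**53 converts to float64 exactly.
exact_limit_ok = int(float(2**53)) == 2**53

# 2**53 + 1 needs 54 significant bits, so the conversion has to round it.
rounded = int(float(2**53 + 1))

# The largest uint64 value rounds up to 2**64 when stored as a float64.
round_trip = int(float(UINT64_MAX))

print(exact_limit_ok)            # True
print(rounded == 2**53 + 1)      # False
print(round_trip - UINT64_MAX)   # 1
```

So an upcast of uint64 to float64 silently changes any value that needs
more than 53 significant bits.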
In some sense, int64 has exactly the same problem, and typical languages
seem to cope with this by using modular arithmetic (as Charles Harris
graciously pointed out). Python doesn't need to rely on this, because
when a native integer overflows, the result is silently promoted to a
long int, which has arbitrary precision in Python (at the expense of
much slower operations and more space required to store it). However,
NumPy and Numexpr (as well as PyTables itself) are all about
performance and space efficiency, so going to arbitrary precision is a
no-go.
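
The contrast can be sketched with the standard library alone. The
example below is only an illustration: it uses ctypes as a stand-in for
a fixed-width native type (ctypes integer types do no overflow
checking), and it is not how NumPy or Numexpr are implemented. Note that
in Python 3 the promotion is invisible, since there is a single int
type:

```python
import ctypes

INT64_MAX = 2**63 - 1

# Python integers simply grow: the result is exact but no longer fits
# in 63 value bits.
py_result = INT64_MAX + 1

# A fixed-width signed 64-bit slot keeps only the low 64 bits, so the
# same value wraps around to the most negative int64.
c_result = ctypes.c_int64(INT64_MAX + 1).value

# Likewise, -1 stored in an unsigned 64-bit slot becomes 2**64 - 1.
u_result = ctypes.c_uint64(-1).value

print(py_result.bit_length())    # 64
print(c_result == -2**63)        # True
print(u_result == 2**64 - 1)     # True
```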
So, for me, it is becoming more and more clear that the most sensible
solution is probably to implement support for uint64 (and probably
int64) as a non-upcastable type, with the possible addition of casting
operators (uint64->int64 and int64->uint64, and also probably
int-->int64 and int-->uint64), as suggested by Timothy Hochberg on the
NumPy list, and to adopt modular arithmetic for dealing with
overflows/underflows. I don't know how difficult it would be to
implement this, however.
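
As a rough sketch of the semantics being proposed (not of any actual
Numexpr code), modular arithmetic and the casting operators could
behave like the plain-Python helpers below. All function names here are
hypothetical and chosen only for this illustration:

```python
# Hypothetical helpers illustrating 64-bit modular arithmetic and casts.
MASK64 = (1 << 64) - 1

def to_uint64(x):
    """Cast any Python int to uint64 by keeping the low 64 bits."""
    return x & MASK64

def to_int64(x):
    """Cast any Python int to int64 (two's complement interpretation)."""
    x &= MASK64
    return x - (1 << 64) if x >= (1 << 63) else x

def add_uint64(a, b):
    """uint64 + uint64 -> uint64, wrapping modulo 2**64 on overflow."""
    return to_uint64(a + b)

def sub_uint64(a, b):
    """uint64 - uint64 -> uint64, wrapping modulo 2**64 on underflow."""
    return to_uint64(a - b)

print(add_uint64(MASK64, 1))        # 0
print(sub_uint64(0, 1) == MASK64)   # True
print(to_int64(MASK64))             # -1
```

With these rules the result of an operation between two uint64 values
is always a uint64, so no upcast to float64 is ever needed.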
>0,0< Francesc Altet http://www.carabos.com/
V V Cárabos Coop. V. Enjoy Data