[SciPy-dev] operations on int8 arrays
Ed Schofield
schofield at ftw.at
Wed Oct 19 17:24:21 CDT 2005
On Wed, 19 Oct 2005, Travis Oliphant wrote:
> Jon Peirce wrote:
>
> >>Scipy arrays with dtype=uint8 or int8 seem to be
> >>mathematically-challenged on my machine (AMD64 WinXP running python
> >>2.4.2, scipy core 0.4.1). Simple int (and various others) appear fine.
> >> >>> import scipy
> >> >>> xx=scipy.array([100,100,100],scipy.int8)
> >> >>> print xx.sum()
> >> 44
> >> >>> xx=scipy.array([100,100,100],scipy.int)
> >> >>> print xx.sum()
> >> 300
>
> This is not a bug. In the first line, you are telling the computer to
> add up 8-bit integers. The result does not fit in an 8-bit integer ---
> thus you are computing modulo 256.
I was bitten by this back in April:
http://www.scipy.org/mailinglists/mailman?fn=scipy-dev/2005-April/002937.html
I wasted several hours then trying to hunt down bugs in my code, before I
finally realized that my sum() call was responsible. I strongly believe
that the default behaviour here should be changed to upcast. My reasons
are:
1. Python itself does this: it 'just works', upcasting where necessary
from int to long integer and, in the future, making division of two int
arguments return a float. We also want to avoid differences between
Python's sum() and scipy's sum():
>>> a = scipy.array([100,100, 100], scipy.int8)
>>> sum(a)
300
>>> scipy.sum(a)
44
2. The result of sum() or mean() without any modulo arithmetic would be a
Python int or float, and it seems reasonable for the result to be accurate
to the width of the output type.
3. The space saved by using a smaller type for accumulating operations is
minimal (unlike, perhaps, for an operation whose output is an array).
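To make the modulo-256 behaviour concrete, here is a minimal sketch using only the standard library. It assumes nothing about scipy itself: ctypes.c_int8 merely stands in for an 8-bit signed accumulator, and the variable names are made up for illustration.

```python
import ctypes

values = [100, 100, 100]

# Python ints do not overflow, so the built-in sum() is exact.
exact = sum(values)

# Emulate an 8-bit signed accumulator: every partial sum is truncated
# to int8 (ctypes does no overflow checking, it simply wraps).
acc = ctypes.c_int8(0)
for v in values:
    acc = ctypes.c_int8(acc.value + v)
wrapped = acc.value

print(exact)    # 300
print(wrapped)  # 44, i.e. 300 modulo 256
```

The intermediate value after two additions is -56 (200 wrapped into the signed 8-bit range), which is why the wrong answer is so hard to spot by eye.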
> It would be possible to make the default reduce type for integers 32-bit
> on 32-bit platforms and 64-bit on 64-bit platforms, i.e. the long integer type.
As far as I understand, a Python int is always a C long, but a C long
isn't always the platform word length (e.g. it is sometimes 32 bits on
64-bit machines). So perhaps it'd be better to make the default reduce
type for integers a C long?
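A quick way to check the C long question on a given machine is sketched below. It uses only ctypes from the standard library; comparing against the pointer size as a proxy for the "platform word length" is my assumption. Note that the statement about a Python int being a C long applies to Python 2; in Python 3 ints have arbitrary precision.

```python
import ctypes

# Width of a C long and of a pointer on this platform, in bits.
long_bits = 8 * ctypes.sizeof(ctypes.c_long)
pointer_bits = 8 * ctypes.sizeof(ctypes.c_void_p)

print("C long:  %d bits" % long_bits)
print("pointer: %d bits" % pointer_bits)

if long_bits < pointer_bits:
    # Typical of 64-bit Windows, where long stays at 32 bits.
    print("C long is narrower than the platform word length")
```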
> Or, this could simply be the default when calling the .sum method (which
> is add.reduce under the covers). The reduce method could stay with the
> default of the integer type.
I think reduce should upcast too.
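For comparison, here is a sketch of how the accumulator type can be chosen explicitly. It assumes a present-day NumPy, where both add.reduce and the .sum method accept a dtype argument; it is not the scipy core 0.4.1 API discussed in this thread.

```python
import numpy as np

a = np.array([100, 100, 100], dtype=np.int8)

# Force an 8-bit accumulator: the sum wraps modulo 256.
narrow = np.add.reduce(a, dtype=np.int8)

# Ask for a wider accumulator: the sum is exact.
wide = np.add.reduce(a, dtype=np.int64)

print(int(narrow))                   # 44
print(int(wide))                     # 300
print(int(a.sum(dtype=np.int64)))    # 300
```

Passing dtype keeps the array itself small while only the running total is widened, which is the cheap case described in point 3 above.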
-- Ed