[Numpy-discussion] large float32 array issue

Warren Weckesser warren.weckesser@enthought....
Wed Nov 3 06:52:58 CDT 2010


On Wed, Nov 3, 2010 at 6:39 AM, Vincent Schut <schut@sarvision.nl> wrote:

>
>
> On 11/03/2010 12:31 PM, Warren Weckesser wrote:
> >
> >
> > On Wed, Nov 3, 2010 at 5:59 AM, Warren Weckesser
> > <warren.weckesser@enthought.com> wrote:
> >
> >
> >
> >     On Wed, Nov 3, 2010 at 3:54 AM, Vincent Schut <schut@sarvision.nl>
> >     wrote:
> >
> >         Hi, I'm running into this strange issue when using some pretty
> >         large float32 arrays. In the following code I create a large
> >         array filled with ones, and calculate the mean and sum, first
> >         with a float64 version, then with a float32 version. Note the
> >         difference between the two. NB: the float64 version is
> >         obviously right :-)
> >
> >
> >
> >         In [2]: areaGrid = numpy.ones((11334, 16002))
> >         In [3]: print(areaGrid.dtype)
> >         float64
> >         In [4]: print(areaGrid.shape, areaGrid.min(), areaGrid.max(),
> >         areaGrid.mean(), areaGrid.sum())
> >         ((11334, 16002), 1.0, 1.0, 1.0, 181366668.0)
> >
> >
> >         In [5]: areaGrid = numpy.ones((11334, 16002), numpy.float32)
> >         In [6]: print(areaGrid.dtype)
> >         float32
> >         In [7]: print(areaGrid.shape, areaGrid.min(), areaGrid.max(),
> >         areaGrid.mean(), areaGrid.sum())
> >         ((11334, 16002), 1.0, 1.0, 0.092504406598019437, 16777216.0)
> >
> >
> >         Can anybody confirm this? And better: explain it? Am I running
> >         into an IEEE float 'feature' that was hidden from me until now?
> >         Or is it a bug somewhere?
> >
> >         Btw I'd like to use float32 arrays, as precision is not really
> >         an issue in this case, but memory usage is...
> >
> >
> >         This is using Python 2.7, numpy from git (yesterday's checkout),
> >         on Arch Linux 64-bit.
> >
> >
> >
> >     The problem kicks in once the array of ones has more than 2**24
> >     elements, because float32 cannot represent 2**24 + 1. Note that
> >     np.float32(2**24) + np.float32(1.0) equals np.float32(2**24):
> >
> >
> >     In [41]: b = np.ones(2**24, np.float32)
> >
> >     In [42]: b.size, b.sum()
> >     Out[42]: (16777216, 16777216.0)
> >
> >     In [43]: b = np.ones(2**24+1, np.float32)
> >
> >     In [44]: b.size, b.sum()
> >     Out[44]: (16777217, 16777216.0)
> >
> >     In [45]: np.spacing(np.float32(2**24))
> >     Out[45]: 2.0
> >
> >     In [46]: np.float32(2**24) + np.float32(1)
> >     Out[46]: 16777216.0
> >
> >
> >
> >
> > By the way, you can override the dtype of the accumulator of the mean()
> > function:
> >
> > In [61]: a = np.ones((11334,16002),np.float32)
> >
> > In [62]: a.mean()  # Not correct
> > Out[62]: 0.092504406598019437
> >
> > In [63]: a.mean(dtype=np.float64)
> > Out[63]: 1.0
>
> Thanks for this. That at least gives me a temporary solution (I actually
> need sum() instead of mean(), but the trick works for sum too).
>


sum() also has the dtype argument.
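
For example, here is a minimal sketch (the names big, stuck, a and total
are just for illustration, not from your code) of why a float32
accumulator gets stuck at 2**24, and how passing dtype to sum() avoids
that while the array itself stays float32:

```python
import numpy as np

# The gap between adjacent float32 values at 2**24 is 2.0, so adding
# 1.0 to a float32 accumulator holding 2**24 rounds back to 2**24.
big = np.float32(2**24)
stuck = big + np.float32(1.0)

# One element more than 2**24, stored as float32 (about 64 MB).
a = np.ones(2**24 + 1, dtype=np.float32)

# Accumulate in float64; only the accumulator is widened, so no
# float64 copy of the array is created.
total = a.sum(dtype=np.float64)
```

Here total should come out as exactly 2**24 + 1, since float64 represents
integers exactly up to 2**53.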

Warren