[Numpy-discussion] average() or mean() errors
Christopher Hanley
chanley at stsci.edu
Fri Jan 26 11:38:50 CST 2007
I filed a similar bug report the other day. I believe it has to do
with the default precision of the accumulator variable in the algorithms
being used. Please see the following example:
Python 2.4.3 (#2, Dec 7 2006, 11:01:45)
[GCC 4.0.1 (Apple Computer, Inc. build 5367)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as n
>>> a = n.array([132.00004,132.00006,132.00005],dtype=n.float32)
>>> a.mean()
132.00006103515625
>>> a = n.array([132.00004,132.00006,132.00005],dtype=n.float64)
>>> a.mean()
132.00004999999999
>>>
In the first case, the calculation is done in single precision, since
that is the type of the input array. In the second case, the calculation
is done in double precision. I think this is the effect being seen.
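To make the single vs. double precision point concrete, here is a rough sketch (not part of the original session) that looks at the gap between adjacent representable values near 132 in each type. The variable names are just for illustration.

```python
import numpy as np

# Gap between adjacent representable values near 132 in each precision.
ulp32 = np.spacing(np.float32(132.0))  # 2**-16, roughly 1.5e-05
ulp64 = np.spacing(np.float64(132.0))  # 2**-45, roughly 2.8e-14

# The inputs differ by 1e-05, which is smaller than the float32 gap,
# so they are already rounded when stored as float32.
values = [132.00004, 132.00006, 132.00005]
stored32 = np.array(values, dtype=np.float32)

print(ulp32, ulp64)
print(stored32.astype(np.float64))  # the values float32 actually holds
```

With float32, the first and third inputs end up stored as the same number, so some of the difference is lost before any summation takes place.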
The workaround would be to say numpy.mean(obj, dtype=numpy.float64),
since mean() accepts a dtype argument for the accumulator.
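A minimal sketch of the workaround, assuming the dtype argument of mean() (as far as I know, numpy.average() does not take a dtype argument in current NumPy releases). The array is the one from the session above; the variable names are illustrative.

```python
import numpy as np

a32 = np.array([132.00004, 132.00006, 132.00005], dtype=np.float32)

# Default: the accumulator has the same type as the input (float32).
mean_single = a32.mean()

# Workaround: ask for a float64 accumulator and a float64 result.
mean_double = a32.mean(dtype=np.float64)

print(repr(mean_single), repr(mean_double))
```

Note that this only removes the rounding that happens while summing. The inputs were already rounded when they were stored as float32, so the result still will not match the float64 session exactly.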
Chris
Stefan van der Walt wrote:
> On Tue, Jan 23, 2007 at 08:29:47PM -0500, Daniel Smith wrote:
>
>> When calling the average() or mean() functions on a small array (3
>> numbers), I am seeing significant numerical errors (on the order of 1%
>> with data to 8 significant digits). The code I am using is essentially:
>>
>> A = zeros(3)
>> A[i] = X
>> B = average(A)
>>
>
> I'm not sure I understand:
>
> In [7]: A = N.zeros(3)
>
> In [8]: A[1] = 3.
>
> In [9]: N.average(A)
> Out[9]: 1.0
>
> In [11]: A[0] = 2.
>
> In [12]: N.average(A)
> Out[12]: 1.66666666667
>
> In [13]: (2+3+0)/3.
> Out[13]: 1.6666666666666667
>
> In [14]: for i in range(1000):
>    ....:     A = N.random.rand(3)
>    ....:     assert N.average(A) == N.sum(A)/3.
>
> Maybe you can give a specific code snippet?
>
> Cheers
> Stéfan