[Numpy-tickets] [NumPy] #624: Statistics-related array methods are not self-consistent.

NumPy numpy-tickets@scipy....
Sun Dec 2 23:50:25 CST 2007


#624: Statistics-related array methods are not self-consistent.
------------------------+---------------------------------------------------
 Reporter:  usovalx     |       Owner:  somebody
     Type:  defect      |      Status:  new     
 Priority:  normal      |   Milestone:  1.0.5   
Component:  numpy.core  |     Version:  none    
 Severity:  normal      |    Keywords:          
------------------------+---------------------------------------------------
 Statistics-related array methods (mean, std) are not self-consistent.
 The data type of the result does not reflect the way the calculations are
 done.
 Consider this example:

 {{{
 a = arange(1, 100, dtype=float32)
 r1 = a.mean()
 r2 = a.astype(float64).mean()

 print(type(r1), r1)
 (<type 'numpy.float64'>, 0.052296744452582464)

 print(type(r2), r2)
 (<type 'numpy.float64'>, 0.052296743192004433)
 }}}

 Both results have the same data type, which is confusing, because the
 exact method of calculation differs (and this might produce dramatically
 different results in some boundary cases).
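
 The effect of the accumulator type can be made explicit with the dtype
 argument of mean(). The following is only a minimal sketch under stated
 assumptions: the input data is hypothetical (it is not the data behind
 the numbers above), and the exact digits printed may vary with the NumPy
 version and its summation strategy.

```python
import numpy as np

# Hypothetical data: one million copies of the float32 value closest to 0.1.
a = np.full(1_000_000, 0.1, dtype=np.float32)

# Same input, two different accumulator types chosen explicitly.
mean32 = a.mean(dtype=np.float32)   # intermediate sums kept in float32
mean64 = a.mean(dtype=np.float64)   # intermediate sums kept in float64

# The result type now tells the reader how the calculation was done.
print(type(mean32), mean32)
print(type(mean64), mean64)
```

 Passing dtype explicitly is one way to make the result type and the
 calculation agree; whether that should be the default is the question
 this ticket raises.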

 Things are even more confusing when we deal with integer arrays:
 {{{
 a = repeat(2147483647, 100)
 print a.sum()
 -100
 print a.mean()
 2147483647.0
 print a.astype(float32).mean()
 2147483648.0
 }}}
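
 The integer case can be reproduced independently of the platform's
 default integer size by fixing the types by hand. This is a sketch that
 assumes a 32-bit element type to stand in for the reporter's setup; the
 variable names are illustrative only.

```python
import numpy as np

# 100 copies of the largest int32 value, stored explicitly as int32.
a = np.repeat(np.int32(2147483647), 100)

# A 32-bit accumulator wraps around modulo 2**32.
wrapped = a.sum(dtype=np.int32)
# A 64-bit accumulator holds the exact total.
exact = a.sum(dtype=np.int64)

# mean() on an integer array accumulates in float64.
mean_int = a.mean()
# Converting to float32 first rounds every element up to 2**31.
mean_f32 = a.astype(np.float32).mean()

print(wrapped)    # -100
print(exact)      # 214748364700
print(mean_int)   # 2147483647.0
print(mean_f32)   # 2147483648.0
```

 Here sum() and mean() disagree because they use different intermediate
 types for the same input, which is the inconsistency described above.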

 I believe the convention for the handling of intermediate variables should
 be clearly specified and obeyed.

-- 
Ticket URL: <http://scipy.org/scipy/numpy/ticket/624>
NumPy <http://projects.scipy.org/scipy/numpy>
The fundamental package needed for scientific computing with Python.

