[Numpy-tickets] [NumPy] #924: problem with summing large array of float32

NumPy numpy-tickets@scipy....
Fri Oct 3 12:11:57 CDT 2008


#924: problem with summing large array of float32
-------------------------------+--------------------------------------------
 Reporter:  emil               |        Owner:  somebody
     Type:  defect             |       Status:  reopened
 Priority:  high               |    Milestone:          
Component:  numpy.core         |      Version:  1.0.1   
 Severity:  critical           |   Resolution:          
 Keywords:  sum, dot, float32  |  
-------------------------------+--------------------------------------------
Changes (by emil):

  * status:  closed => reopened
  * resolution:  wontfix =>

Comment:

 Thanks for clarifying the issue. I should have realized that it was
 round-off error, especially after I went to Fortran.

 But I would still ask that the accumulators for sum, dot, and similar
 functions be float64 by default (note that dot doesn't seem to have an
 option to change the type of the accumulator).
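
 For reference, a minimal sketch of requesting a float64 accumulator
 explicitly. It assumes a NumPy version where sum accepts a dtype argument;
 the array names and sizes are illustrative only. Because dot has no such
 argument, the sketch upcasts the operands instead, which is one possible
 workaround rather than a recommended API.

```python
import numpy as np

# Small illustrative inputs; real image volumes would be far larger.
a = np.ones(1000, dtype=np.float32)
b = np.full(1000, 2.0, dtype=np.float32)

# sum: the dtype argument selects the accumulator (and result) type.
total = np.sum(a, dtype=np.float64)

# dot: no accumulator option, so upcast the operands before the product.
# This temporarily needs twice the memory of each float32 operand.
product = np.dot(a.astype(np.float64), b.astype(np.float64))

print(total.dtype, total)
print(product.dtype, product)
```

 The upcast costs memory, which is the reason for storing float32 in the
 first place, so it may not be practical for very large volumes.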

 Here's why:
 For medical imaging, we use large single-precision arrays to save space.
 These arrays are not sparse, and their entries have similar values (for
 example, x-ray attenuation coefficients will not vary greatly over a
 volume). Processing operations used in computer-aided diagnosis involve
 summing an array or dotting it with another. The error introduced by a
 single-precision accumulator can be large, as I found out. As a user of a
 high-level package such as MATLAB or numpy, one generally doesn't expect
 this kind of error when summing values that do not alternate in sign.
 I have checked with my colleagues on the floor, and nobody suspected this
 problem, although they understand it when I explain what is happening.
 Some are now wondering whether they have an error in previous work.

 Regarding the above example where the sum should be 110880003: if the
 true sum is 110880003 and I get 110880000, I'm happy, because that answer
 is reasonable for float32.
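
 As a quick check of why 110880000 is a reasonable float32 answer, the
 sketch below rounds the exact value to float32 and prints the gap between
 neighbouring float32 values at that magnitude. The variable names are
 illustrative.

```python
import numpy as np

exact = 110880003

# Nearest float32 to the exact integer sum.
as_float32 = np.float32(exact)

# Distance from that float32 value to the next representable one.
gap = np.spacing(as_float32)

print(float(as_float32))  # 110880000.0
print(float(gap))         # 8.0
```

 The value lies between 2**26 and 2**27, where adjacent float32 values are
 8 apart, so 110880003 cannot be stored exactly and 110880000 is its
 nearest representable neighbour.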

 My example above is extreme. Summing a float32 array of ones with shape
 (840, 2200, 60) yields 16,777,216 instead of 110,880,000. This answer is
 way off, so I would probably suspect something is wrong.
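
 The value 16,777,216 is 2**24, the point past which a float32 can no
 longer absorb an added 1. The sketch below models a one-at-a-time float32
 accumulator with a hypothetical helper, naive_sum_float32, and starts just
 below the limit to keep the loop short. It illustrates the rounding
 behaviour only and is not NumPy's internal code; other NumPy releases may
 add elements in a different order and give a different result from np.sum
 itself.

```python
import numpy as np

def naive_sum_float32(values, start=0.0):
    """Add values one at a time into a float32 accumulator (hypothetical helper)."""
    acc = np.float32(start)
    for v in values:
        # Every partial sum is rounded back to float32.
        acc = np.float32(acc + np.float32(v))
    return acc

limit = 2 ** 24  # 16,777,216

# Twenty ones, added to an accumulator that starts 5 below the limit.
ones = np.ones(20, dtype=np.float32)
stuck = naive_sum_float32(ones, start=limit - 5)

# The same total computed with a float64 accumulator for comparison.
reference = np.sum(ones, dtype=np.float64) + (limit - 5)

print(float(stuck))      # 16777216.0
print(float(reference))  # 16777231.0
```

 Once the accumulator reaches 2**24, each further addition of 1.0 rounds
 back to the same value, so the running total stops growing.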

 The problem, however, is that this type of error can lead to computational
 errors on the order of a few percent, which is inaccurate enough to cause
 problems but not inaccurate enough for the problem to be easily detected.
 In fact, in the situation where I caught the problem, the sum result was
 off by only 4%.

-- 
Ticket URL: <http://scipy.org/scipy/numpy/ticket/924#comment:2>
NumPy <http://projects.scipy.org/scipy/numpy>
The fundamental package needed for scientific computing with Python.

