[Scipy-tickets] [SciPy] #1653: scoreatprecentile return wrong values when used on array with NaNs

SciPy Trac scipy-tickets@scipy....
Wed May 2 15:16:11 CDT 2012


#1653: scoreatprecentile return wrong values when used on array with NaNs
-------------------------+--------------------------------------------------
 Reporter:  imrisofer    |       Owner:  somebody   
     Type:  defect       |      Status:  new        
 Priority:  normal       |   Milestone:  Unscheduled
Component:  scipy.stats  |     Version:  0.10.0     
 Keywords:               |  
-------------------------+--------------------------------------------------

Comment(by jseabold):

 The behavior is consistent. As Josef mentioned, it's essentially treating
 the NaNs as very large numbers, though the interpolation in the 3rd case
 results in a NaN. Consider this example, where 1e4 stands in for each NaN:

 {{{
 import numpy as np
 from scipy.stats import scoreatpercentile

 # 1e4 stands in for each NaN: a value far larger than the real data
 a1 = np.array([1,2,3,4])
 a2 = np.array([1,2,3,4,1e4])
 a3 = np.array([1,2,3,4,1e4,1e4])

 arrays = [a1, a2, a3]

 for i, a in enumerate(arrays):
     print("array with %d NaNs" % i)
     print("q25: ", scoreatpercentile(a, 25))
     print("q50: ", scoreatpercentile(a, 50))
     print("q75: ", scoreatpercentile(a, 75))
     print()
 }}}

 Outputs

 {{{
 array with 0 NaNs
 q25:  1.75
 q50:  2.5
 q75:  3.25

 array with 1 NaNs
 q25:  2.0
 q50:  3.0
 q75:  4.0

 array with 2 NaNs
 q25:  2.25
 q50:  3.5
 q75:  7501.0
 }}}
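
 To make the arithmetic behind these numbers explicit, here is a minimal
 sketch of a percentile by linear interpolation. It is an illustration
 only, not SciPy's actual code: the name `interp_percentile` is
 hypothetical, and placing the NaNs at the end of the sorted data is an
 assumption chosen to match where np.sort puts them.

```python
import math

def interp_percentile(values, per):
    """Percentile by linear interpolation (illustrative sketch only)."""
    # Sort the real values and put any NaNs last (assumed ordering).
    nans = [x for x in values if math.isnan(x)]
    s = sorted(x for x in values if not math.isnan(x)) + nans
    idx = per / 100.0 * (len(s) - 1)   # fractional rank in the sorted data
    lo = int(idx)
    frac = idx - lo
    if frac == 0:
        return s[lo]                   # exact hit: no interpolation needed
    return s[lo] + frac * (s[lo + 1] - s[lo])

nan = float("nan")
print(interp_percentile([1, 2, 3, 4, 1e4, 1e4], 75))  # 4 + 0.75 * (1e4 - 4)
print(interp_percentile([1, 2, 3, 4, nan, nan], 50))  # ranks 2 and 3 are real
print(interp_percentile([1, 2, 3, 4, nan, nan], 75))  # between 4 and NaN
```

 The first call reproduces the 7501.0 shown above. With real NaNs the q50
 call still returns 3.5, because it never touches a NaN, while the q75
 call returns nan, because any arithmetic involving NaN yields NaN.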

 FWIW, the NumPy developers are working on better NaN-handling machinery.
 Until that is settled, I don't know that we'll see blanket explicit NaN
 handling in SciPy.
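
 In the meantime, a caller can handle NaNs explicitly. The following is a
 minimal sketch of that idea, not something SciPy provides: the helper
 name `percentile_ignoring_nans` is hypothetical, and dropping NaNs
 (rather than raising an error or propagating them) is an assumption
 about what the caller wants. It uses only np.isnan and np.percentile.

```python
import numpy as np

def percentile_ignoring_nans(a, per):
    """Percentile of the non-NaN entries of `a` (hypothetical helper)."""
    a = np.asarray(a, dtype=float)
    valid = a[~np.isnan(a)]          # keep only the non-NaN values
    if valid.size == 0:
        return np.nan                # nothing left to rank
    return np.percentile(valid, per)

data = np.array([1, 2, 3, 4, np.nan, np.nan])
for per in (25, 50, 75):
    print("q%d: " % per, percentile_ignoring_nans(data, per))
```

 With the NaNs dropped, the quartiles agree with the "0 NaNs" case in the
 output above.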

-- 
Ticket URL: <http://projects.scipy.org/scipy/ticket/1653#comment:3>
SciPy <http://www.scipy.org>
SciPy is open-source software for mathematics, science, and engineering.

