[NumPy-Tickets] [NumPy] #2126: Possibly unwanted behaviour of numpy.median when the array contains numpy.nan

NumPy Trac numpy-tickets@scipy....
Fri May 4 03:24:52 CDT 2012


#2126: Possibly unwanted behaviour of numpy.median when the array contains
numpy.nan
------------------------+---------------------------------------------------
 Reporter:  koji        |       Owner:  somebody   
     Type:  defect      |      Status:  new        
 Priority:  high        |   Milestone:  Unscheduled
Component:  numpy.core  |     Version:  1.5.1      
 Keywords:              |  
------------------------+---------------------------------------------------
 Because the median function relies on the sort function, which places nan
 entries at the end, it effectively treats each nan as the largest value in
 the array and can therefore overestimate the median when the array
 contains nan.
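
 To make the mechanism concrete, here is a minimal sketch of a sort-based
 median. It is a simplified illustration, not NumPy's actual
 implementation, and the name sort_based_median is hypothetical. It only
 assumes that np.sort places nan entries last, as the session in this
 ticket shows.

```python
import numpy as np

def sort_based_median(values):
    """Hypothetical, simplified median: sort, then read the middle.

    Assumes a non-empty 1-D input. Not NumPy's real implementation.
    """
    s = np.sort(np.asarray(values, dtype=float))  # nan entries end up last
    n = s.size
    mid = n // 2
    if n % 2 == 1:
        # Odd length: the single middle element.
        return float(s[mid])
    # Even length: mean of the two middle elements.
    return float((s[mid - 1] + s[mid]) / 2.0)

for data in ([np.nan, 10], [np.nan, 10, 11], [np.nan, 10, 11, 12]):
    print(data, "->", sort_based_median(data))
```

 With nan sorted last, the two-element case averages 10 with nan, the
 three-element case reads 11 from the middle, and the four-element case
 averages 11 and 12. The nan only affects the result when it happens to
 land in a middle position.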

 I was really surprised to see that In [38] returned 11.5. I was expecting
 either np.nan or 11.0.

 Perhaps np.nan should be handled explicitly: either remove the nan entries
 before sorting, or return np.nan whenever the array contains one or more
 nans. It makes me wonder whether anybody has tripped over this without
 realising it.
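
 The two options suggested above could be sketched as follows. Both helper
 names (median_ignoring_nan, median_propagating_nan) are hypothetical and
 are not part of NumPy; the sketch uses only np.isnan and np.median, and
 returning nan for an input with no non-nan entries is an assumption made
 here for illustration.

```python
import numpy as np

def median_ignoring_nan(values):
    """Hypothetical helper: median of the non-nan entries only."""
    arr = np.asarray(values, dtype=float).ravel()
    kept = arr[~np.isnan(arr)]  # drop nan entries before any sorting
    if kept.size == 0:
        # Assumption: nothing left to take a median of, so return nan.
        return np.nan
    return float(np.median(kept))

def median_propagating_nan(values):
    """Hypothetical helper: nan if any entry is nan, else the median."""
    arr = np.asarray(values, dtype=float).ravel()
    if np.isnan(arr).any():
        return np.nan
    return float(np.median(arr))

data = [np.nan, 10, 11, 12]
print(median_ignoring_nan(data))     # 11.0
print(median_propagating_nan(data))  # nan
```

 These correspond to the two results expected for In [38]: 11.0 when the
 nan is dropped, np.nan when it is propagated.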

 In [33]: np.sort(np.array([np.nan, 10]))
 Out[33]: array([ 10.,  nan])

 In [34]: np.sort(np.array([np.nan, 10, 11]))
 Out[34]: array([ 10.,  11.,  nan])

 In [35]: np.sort(np.array([np.nan, 10, 11, 12]))
 Out[35]: array([ 10.,  11.,  12.,  nan])

 In [36]: np.median(np.array([np.nan, 10]))
 Out[36]: nan

 In [37]: np.median(np.array([np.nan, 10, 11]))
 Out[37]: 11.0

 In [38]: np.median(np.array([np.nan, 10, 11, 12]))
 Out[38]: 11.5


 In [39]: np.__version__
 Out[39]: '1.5.1'

 Python 2.7.1 |EPD 7.0-1 (32-bit)| (r271:86832, Dec  3 2010, 15:41:32)

-- 
Ticket URL: <http://projects.scipy.org/numpy/ticket/2126>
NumPy <http://projects.scipy.org/numpy>

