[NumPy-Tickets] [NumPy] #1514: unique and NaN entries

NumPy Trac numpy-tickets@scipy....
Fri Jun 18 16:14:29 CDT 2010

#1514: unique and NaN entries
 Reporter:  rspringuel  |       Owner:  somebody
     Type:  defect      |      Status:  new     
 Priority:  normal      |   Milestone:  2.0.0   
Component:  Other       |     Version:  1.4.0   
 Keywords:              |  
 When unique operates on an array with multiple NaN entries its return
 includes a NaN for each entry that was NaN in the original array.

 >>> a = random.randint(5,size=100).astype(float)
 >>> a[12] = nan #add a single nan entry
 >>> unique(a)
 array([  0.,   1.,   2.,   3.,   4.,  NaN])
 >>> a[20] = nan #add a second
 >>> unique(a)
 array([  0.,   1.,   2.,   3.,   4.,  NaN,  NaN])
 >>> a[13] = nan #and a third
 >>> unique(a)
 array([  0.,   1.,   2.,   3.,   4.,  NaN,  NaN,  NaN])

 This is probably due to the fact that x == y evaluates to False if both x
 and y are NaN.  Unique needs to have "or (isnan(x) and isnan(y))" added to
 the conditional that checks for the presence of a value in the already
 identified values.  I don't know where unique lives in numpy and couldn't
 find it when I went looking, so I can't make the change myself (or even be
 sure what the exact syntax of the conditional should be).
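
 To illustrate the comparison issue described above, here is a minimal
 sketch.  The helper name same_value is hypothetical and is not part of
 numpy; it only shows the kind of NaN-aware test the conditional would
 need, not how unique is actually implemented.

```python
import numpy as np

def same_value(x, y):
    """Hypothetical helper: equality that treats two NaNs as the same value."""
    # Plain == is False when both operands are NaN, so check that case explicitly.
    return bool(x == y or (np.isnan(x) and np.isnan(y)))

nan_eq = (np.nan == np.nan)             # ordinary comparison of NaN with itself
same_nan = same_value(np.nan, np.nan)   # NaN-aware comparison
same_num = same_value(1.0, 1.0)
diff = same_value(1.0, np.nan)
```

 With plain ==, a NaN never matches a previously seen NaN, which would
 explain why each NaN entry shows up separately in the result.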

 Also, the following function can be used to patch over the behavior.

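 import numpy

 def nanunique(x):
     # Collapse the repeated NaN entries that numpy.unique returns.
     a = numpy.unique(x)
     r = []
     for i in a:
         # Skip values already kept; NaN != NaN, so test for NaN explicitly.
         if i in r or (numpy.isnan(i) and numpy.any(numpy.isnan(r))):
             continue
         r.append(i)
     return numpy.array(r)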
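
 The same workaround can probably be written without the Python-level
 loop.  The following is only a sketch under the assumption that the
 input is a float array (or convertible to one); the name
 nanunique_vectorized is made up for illustration.

```python
import numpy as np

def nanunique_vectorized(x):
    """Unique values of x with all NaN entries collapsed into at most one."""
    x = np.asarray(x, dtype=float).ravel()
    mask = np.isnan(x)
    # Take the unique values among the non-NaN entries only.
    u = np.unique(x[~mask])
    # Append a single NaN if the input contained any.
    if mask.any():
        u = np.append(u, np.nan)
    return u

result = nanunique_vectorized([0.0, np.nan, 1.0, np.nan, 0.0])
```

 Because the NaN entries are removed before unique is called, this does
 not depend on how unique itself compares NaN values.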

Ticket URL: <http://projects.scipy.org/numpy/ticket/1514>
NumPy <http://projects.scipy.org/numpy>