# [NumPy-Tickets] [NumPy] #1514: unique and NaN entries

NumPy Trac numpy-tickets@scipy....
Fri Jun 18 16:14:29 CDT 2010

```
#1514: unique and NaN entries
------------------------+---------------------------------------------------
 Reporter:  rspringuel  |       Owner:  somebody
     Type:  defect      |      Status:  new
 Priority:  normal      |   Milestone:  2.0.0
Component:  Other       |     Version:  1.4.0
 Keywords:              |
------------------------+---------------------------------------------------
When unique operates on an array with multiple NaN entries its return
includes a NaN for each entry that was NaN in the original array.

Examples:
>>> a = random.randint(5,size=100).astype(float)
>>> a[12] = nan #add a single nan entry
>>> unique(a)
array([  0.,   1.,   2.,   3.,   4.,  NaN])
>>> a[20] = nan #add a second
>>> unique(a)
array([  0.,   1.,   2.,   3.,   4.,  NaN,  NaN])
>>> a[13] = nan #and a third
>>> unique(a)
array([  0.,   1.,   2.,   3.,   4.,  NaN,  NaN,  NaN])

This is probably due to the fact that x == y evaluates to False if both x
and y are NaN.  Unique needs to have "or (isnan(x) and isnan(y))" added to
the conditional that checks for the presence of a value in the already
identified values.  I don't know where unique lives in numpy and couldn't
find it when I went looking, so I can't make the change myself (or even be
sure what the exact syntax of the conditional should be).

Also, the following function can be used to patch over the behavior.

import numpy

def nanunique(x):
    a = numpy.unique(x)
    r = []
    for i in a:
        # skip values already kept; any NaN after the first counts as a duplicate
        if i in r or (numpy.isnan(i) and numpy.any(numpy.isnan(r))):
            continue
        else:
            r.append(i)
    return numpy.array(r)

--
Ticket URL: <http://projects.scipy.org/numpy/ticket/1514>
NumPy <http://projects.scipy.org/numpy>
```
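
The workaround in the ticket loops over the values in Python. As a rough illustration only, the same idea can be sketched in vectorized form: set the NaN entries aside, deduplicate the rest with ordinary equality, and append a single NaN if any were present. The name `nanunique_sketch` is hypothetical and is not part of NumPy, and the sketch assumes a one-dimensional view of floating-point input is acceptable.

```python
import numpy as np


def nanunique_sketch(x):
    """Return the sorted unique values of x, collapsing all NaNs into one.

    Hypothetical helper for illustration; not a NumPy function.
    """
    # Flatten to a 1-D float array so isnan is well defined.
    x = np.asarray(x, dtype=float).ravel()
    nan_mask = np.isnan(x)
    # NaN-free values can be deduplicated with plain equality.
    uniq = np.unique(x[~nan_mask])
    if nan_mask.any():
        # Append one NaN to stand in for every NaN in the input.
        uniq = np.append(uniq, np.nan)
    return uniq


# NaN never compares equal to itself, which is why equality-based
# deduplication can treat each NaN as a distinct value.
print(np.nan == np.nan)

a = np.array([0.0, 1.0, np.nan, 1.0, np.nan, 2.0])
print(nanunique_sketch(a))
```

Unlike the loop in the ticket, this version never compares a NaN against anything, so it does not depend on how `unique` itself orders or groups NaN entries.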
