[Numpy-discussion] unique rows of array

josef.pktd@gmai... josef.pktd@gmai...
Tue Aug 18 00:25:24 CDT 2009


On Tue, Aug 18, 2009 at 1:03 AM, <josef.pktd@gmail.com> wrote:
> On Tue, Aug 18, 2009 at 12:59 AM, Maria Liukis<liukis@usc.edu> wrote:
>>
>> On Aug 17, 2009, at 9:51 PM, Charles R Harris wrote:
>>
>>
>> On Mon, Aug 17, 2009 at 10:30 PM, Maria Liukis <liukis@usc.edu> wrote:
>>>
>>> Hello everybody,
>>> While re-implementing some Matlab code in Python, I've run into a problem
>>> of finding a NumPy function analogous to the Matlab's "unique(array,
>>> 'rows')" to get unique rows of an array. Searching the web, I've found a
>>> similar discussion from couple of years ago with an example:
>>
>> Just to be clear, do you mean finding all rows that only occur once in the
>> array?
>>
>> Yes.
>
> I interpreted your question as removing duplicates. It keeps rows that
> occur more than once.
> That's what my example is intended to do.
>
> Josef
>
>>
>> <snip>
>>
>> Chuck
>>

Just a reminder about views on views: I don't think the recommendation
to take the transpose to get unique columns works.
We had a discussion some time ago that a view operates on the original
array data and not on the intermediate view, and in this case the
transpose creates a view. Copying the transpose first, or passing it
through np.ascontiguousarray, avoids the problem. See the session at the
end of this message.
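
A minimal sketch of the layout issue described above, assuming a current
NumPy. The names `t` and `fixed` are illustrative only. It only inspects
memory-layout flags; it does not reproduce the failing structured view.

```python
import numpy as np

cc = np.array([[10, 1, 2],
               [3, 4, 5],
               [3, 4, 5],
               [9, 10, 11]])

# The transpose is a view: same buffer as cc, but no longer C-contiguous.
t = cc.T
t_is_c_contiguous = t.flags['C_CONTIGUOUS']
t_shares_buffer = np.shares_memory(t, cc)

# ascontiguousarray copies the data into a fresh C-ordered buffer,
# which is the layout the structured-dtype view trick relies on.
fixed = np.ascontiguousarray(t)
fixed_is_c_contiguous = fixed.flags['C_CONTIGUOUS']
fixed_shares_buffer = np.shares_memory(fixed, cc)

print(t_is_c_contiguous, t_shares_buffer)
print(fixed_is_c_contiguous, fixed_shares_buffer)
```

Checking `flags['C_CONTIGUOUS']` before taking a structured view is a cheap
way to decide whether a copy is needed.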

Also, unique sorts its result, so the original row order is not preserved.
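
If the original order matters, one possible workaround is to ask for the
index of each first occurrence and re-sort by it. This is only a sketch: it
assumes a NumPy recent enough that np.unique accepts the `axis` and
`return_index` arguments, and the helper name `unique_rows_keep_order` is
made up for illustration.

```python
import numpy as np

def unique_rows_keep_order(a):
    """Drop duplicate rows, keeping first occurrences in their original order."""
    a = np.asarray(a)
    # return_index gives, for each unique row, the index of its first occurrence.
    _, first_idx = np.unique(a, axis=0, return_index=True)
    # Sorting those indices restores the order in which rows first appeared.
    return a[np.sort(first_idx)]

c = np.array([[10, 1, 2],
              [3, 4, 5],
              [3, 4, 5],
              [9, 10, 11]])

result = unique_rows_keep_order(c)
print(result)
```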

Josef


>>> c=np.array([[ 10,  1,  2],
       [ 3,  4,  5],
       [ 3,  4,  5],
       [ 9, 10, 11]])
>>> cc = c.copy() #backup
>>> c = cc.T
>>> cc
array([[10,  1,  2],
       [ 3,  4,  5],
       [ 3,  4,  5],
       [ 9, 10, 11]])
>>> np.unique1d(c.view([('',c.dtype)]*c.shape[1])).view(c.dtype).reshape(-1,c.shape[1])
Traceback (most recent call last):
  File "<pyshell#46>", line 1, in <module>
    np.unique1d(c.view([('',c.dtype)]*c.shape[1])).view(c.dtype).reshape(-1,c.shape[1])
ValueError: new type not compatible with array.


>>> c = cc.T.copy()
>>> c
array([[10,  3,  3,  9],
       [ 1,  4,  4, 10],
       [ 2,  5,  5, 11]])
>>> np.unique1d(c.view([('',c.dtype)]*c.shape[1])).view(c.dtype).reshape(-1,c.shape[1])
array([[ 1,  4,  4, 10],
       [ 2,  5,  5, 11],
       [10,  3,  3,  9]])
>>> c = np.ascontiguousarray(cc.T)
>>> np.unique1d(c.view([('',c.dtype)]*c.shape[1])).view(c.dtype).reshape(-1,c.shape[1])
array([[ 1,  4,  4, 10],
       [ 2,  5,  5, 11],
       [10,  3,  3,  9]])
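
The thread mentions two readings of "unique rows": removing duplicates, and
keeping only rows that occur exactly once. A hedged sketch of both, assuming
a NumPy version where np.unique supports `axis` and `return_counts` (the
session above uses the older np.unique1d, which predates those arguments):

```python
import numpy as np

c = np.array([[10, 1, 2],
              [3, 4, 5],
              [3, 4, 5],
              [9, 10, 11]])

# rows: distinct rows in sorted order; counts: how often each one occurs.
rows, counts = np.unique(c, axis=0, return_counts=True)

# Reading 1: remove duplicates (the row [3, 4, 5] is kept once).
deduplicated = rows

# Reading 2: keep only rows that occur exactly once ([3, 4, 5] is dropped).
only_once = rows[counts == 1]

print(deduplicated)
print(only_once)
```

For columns instead of rows, passing `axis=1` should work without an
explicit transpose or copy.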

