[Numpy-discussion] unique() should return a sorted array

Robert Cimrman cimrman3 at ntc.zcu.cz
Mon Jul 3 04:35:15 CDT 2006


Sasha wrote:
> On 7/2/06, Norbert Nemec <Norbert.Nemec.list at gmx.de> wrote:
>> ...
>> Does anybody know about the internals of the Python "set"? How is
>> .keys() implemented? I really have doubts about the efficiency
>> of this method.
>>
> The set implementation (Objects/setobject.c) is a copy-and-paste job from
> dictobject with the values removed.  As a result, it is heavily optimized
> for the case of string-valued keys - a case that is almost irrelevant
> for numpy.
> 
> I think something like the following (untested, 1d only) will probably
> be much faster and sorted:
> 
> def unique(x):
>       s = sort(x)
>       r = empty_like(s)
>       r[:-1] = s[1:]
>       r[-1] = s[0]
>       return s[r != s]

numpy already has 1d array set operations like this; see
numpy/lib/arraysetops.py (unique1d, ...).

r.
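
For reference, the sort-and-compare-neighbours idea from the quoted message can be sketched as below. This is only an illustration, not the code in arraysetops.py: the name unique_sorted_1d is hypothetical, and it assumes a reasonably recent NumPy imported as np. It differs from the quoted sketch in one respect: the wrap-around comparison (r[-1] = s[0]) yields an empty result when every element is equal, for example [2, 2, 2], so this version always keeps the first sorted element and compares each remaining element with its predecessor.

```python
import numpy as np


def unique_sorted_1d(x):
    """Hypothetical helper: sorted unique values of a 1-d array."""
    # Flatten and sort so that equal values end up adjacent.
    s = np.sort(np.asarray(x).ravel())
    if s.size == 0:
        return s
    # Always keep the first element; keep any later element only
    # if it differs from the element just before it.
    keep = np.ones(s.size, dtype=bool)
    keep[1:] = s[1:] != s[:-1]
    return s[keep]


print(unique_sorted_1d([3, 1, 2, 3, 1]))  # [1 2 3]
print(unique_sorted_1d([2, 2, 2]))        # [2]
```

In current NumPy the same result should be available directly from np.unique, which returns the sorted unique elements of an array.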