[Numpy-discussion] unique() should return a sorted array
Robert Kern
robert.kern at gmail.com
Sun Jul 2 05:37:29 CDT 2006
Norbert Nemec wrote:
> I agree.
>
> Currently the order of the output of unique is undefined. Defining it in
> such a way that it produces a sorted array will not break any compatibility.
>
> My idea would be something like
>
> from numpy import asarray, concatenate
>
> def unique(arr, sort=True):
>     if hasattr(arr, 'flatten'):
>         tmp = arr.flatten()
>         tmp.sort()
>         # keep the first element and every element that differs from its predecessor
>         idx = concatenate(([True], tmp[1:] != tmp[:-1]))
>         return tmp[idx]
>     else:  # for compatibility:
>         set = {}
>         for item in arr:
>             set[item] = None
>         if sort:
>             return asarray(sorted(set.keys()))
>         else:
>             return asarray(list(set.keys()))
>
> Does anybody know about the internals of the Python "set"? How is
> .keys() implemented? I have real doubts about the efficiency of this
> method.
Well, that's a dictionary, not a set, but they both use the same algorithm: they
are both hash tables. If you need more specific details about how the hash
tables are implemented, the source (Objects/dictobject.c) is the best place for them.
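
To make the comparison concrete, here is a minimal pure-Python sketch of the two strategies discussed above: deduplicating through a hash table (a dict or a set) and deduplicating by sorting and dropping repeated neighbours. The function names unique_via_dict, unique_via_set and unique_via_sort are made up for this illustration; this is not how numpy implements unique().

```python
def unique_via_dict(items):
    """Deduplicate hashable items with a dict used as a hash table, then sort."""
    seen = {}
    for item in items:
        seen[item] = None  # dict keys are unique by construction
    return sorted(seen)


def unique_via_set(items):
    """Same idea with the built-in set type, which is also a hash table."""
    return sorted(set(items))


def unique_via_sort(items):
    """Sort first, then keep each element that differs from its predecessor."""
    ordered = sorted(items)
    return [x for i, x in enumerate(ordered) if i == 0 or x != ordered[i - 1]]


data = [3, 1, 2, 3, 1, 2, 2]
result_dict = unique_via_dict(data)
result_set = unique_via_set(data)
result_sort = unique_via_sort(data)
print(result_dict, result_set, result_sort)  # [1, 2, 3] [1, 2, 3] [1, 2, 3]
```

All three give the same sorted result. The hash-table versions only sort the distinct values, which can help when the input has many duplicates, but they require hashable items; the sort-based version only needs items that can be ordered and compared.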
--
Robert Kern
"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco