[Numpy-discussion] String sort
Thu Feb 14 12:44:55 CST 2008
A Thursday 14 February 2008, Charles R Harris escrigué:
> On Thu, Feb 14, 2008 at 10:46 AM, Francesc Altet <firstname.lastname@example.org>
> > Looking forward to see the new qsort for strings in NumPy (the
> > specific version for merge sort is very welcome too!).
> I could never figure out what the merge sort was good for. I did the
> indirect version in numarray because I needed a stable sort to
> implement lexsort, which was my original aim. I just added the direct
> version for completeness. If you have a use for it, I would love to
> hear it.
Well, I must to confess that I've not used merge sorts yet, but I'd like
to test them in the context of my PSI (Partially Sorted Indexes, see
 for a white paper on a concrete implementation) work. My hope is
that, as a merge sort keeps the order of indices of elements that are
equal (this is what 'stable' means), this would allow better
compression rates for indices (and hence, less I/O effort to bring the
indices from disk into memory and ultimately allowing for faster lookup
speed). This will probably be only important when one have data
distributions with rather low cardinality, but these scenarios can be
more frequent/important than one may think.
>0,0< Francesc Altet http://www.carabos.com/
V V Cárabos Coop. V. Enjoy Data
More information about the Numpy-discussion