[Numpy-discussion] String sort
Charles R Harris
Thu Feb 14 12:03:56 CST 2008
On Thu, Feb 14, 2008 at 10:46 AM, Francesc Altet <firstname.lastname@example.org> wrote:
> A Thursday 14 February 2008, Charles R Harris escrigué:
> > On Thu, Feb 14, 2008 at 9:11 AM, Francesc Altet <email@example.com>
> > > From the plot (attached), it can be drawn the next conclusions:
> > >
> > > 1) copy_string2 (the combination of manual copy and memcpy) is not
> > > better than memcpy for *any* value of the string length in our
> > > Opteron platform. Also, the improvements with Pentium4 was not
> > > that big (<20%). In consequence, I'd choose to always use memcpy
> > > and discard copy_string2.
> > Your copy_string2 is now the version in numpy. I'm hesitant to make
> > memcpy the default for all string lengths because I've read that
> > memcpy was much improved in later gcc (>= 4.1 ?), but known slow in
> > older versions. So perhaps in a year or two when the newer compilers
> > are more widespread would be a better time to make the change. The
> > switch at the 16 char length shouldn't make that much difference in
> > practice. I'll put a comment in the source so that the thought won't
> > get lost.
> Well, copy_string2 is only marginally slower than memcpy on modern gcc
> compilers and processors, so I presume that this is fine.
> > BTW, using copy_string2 much improved the performance of
> > the string mergesort where a lot of data needs to be copied to the
> > work array. It's now half as fast as quicksort instead of 1/3 ;) Heap
> > sort continues in it's traditional slot as the slowest of all. Slow
> > but safe.
> I've seen that you have added specific code for merge and heap sorting
> algorithms for strings. Looks good.
> > > 2) Curiously enough, the indirect sort in Opterons is *always*
> > > faster than newqsort+memcpy. For large values of string lengths (>
> > > 256), the speed-up can be up to 3x, which is a lot. And I'd say
> > > that this keeps true also for most modern processors (read Core2,
> > > Barcelona). For older processors (Pentium4), the indirect method
> > > can be slower than direct plot for small lengths, but by a very few
> > > extent (<10%).
> > The new indirect quicksort for strings is faster than the old qsort
> > based default, so perhaps that is also making a difference.
> Yes, indeed it does!
> > > Conclusion 2 makes me wonder if it wouldn't be useful the
> > > introduction of a new flag in sort, like:
> > >
> > > """
> > > `indirect` - Use the indirect method for sorting. This requires
> > > more memory (4/8 bytes per array element), but for sorting arrays
> > > of strings it is almost always faster than the direct approach
> > > (default). Beware: this is not the case when using numerical
> > > values, where the use of this method for sorting is not
> > > recommendable.
> > > """
> > I'm more inclined to leave this to the user. I have a todo to add a
> > function to numpy that makes it easier to use the argsort output to
> > sort multidimensional arrays, I'll name it argtake or some such and
> > it will use the argsort output along with an axis argument. It won't
> > be quite as memory efficient for multidimensional arrays, but it
> > should work about the same in the 1D case.
> OK. I don't completely grasp what are you trying to do here, but seems
> like a conservative enough path (in the sense that it won't touch the
> current parameters of existing sorting methods).
> Looking forward to see the new qsort for strings in NumPy (the specific
> version for merge sort is very welcome too!).
I could never figure out what the merge sort was good for. I did the
indirect version in numarray because I needed a stable sort to implement
lexsort, which was my original aim. I just added the direct version for
completeness. If you have a use for it, I would love to hear it.
-------------- next part --------------
An HTML attachment was scrubbed...
More information about the Numpy-discussion