[Numpy-discussion] String sort
Charles R Harris
Wed Feb 13 12:44:05 CST 2008
On Feb 13, 2008 10:56 AM, Francesc Altet <firstname.lastname@example.org> wrote:
> On Wednesday 13 February 2008, Charles R Harris wrote:
> > OK,
> > The new quicksorts are in svn. Francesc, can you check them out?
> Looks good here. However, you seem to keep using your own copy_string()
> instead of plain memcpy(). In previous benchmarks, I've seen that
> copy_string() is faster than memcpy only for small values of the length
> of the block to be copied.
Yes, I noticed that your benchmark program crossed over to using memcpy at
16 chars, and I will probably add that feature. I was being conservative to
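A minimal sketch of the crossover idea discussed above, assuming a byte-by-byte loop for short blocks and memcpy() for longer ones. The name copy_block and the SMALL_COPY_LIMIT constant are hypothetical; the value 16 is only the figure mentioned for the benchmark program, and the right threshold likely depends on the machine and the C library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical threshold; 16 is the crossover mentioned in the thread,
 * not a measured constant for any particular platform. */
#define SMALL_COPY_LIMIT 16

/* Copy len bytes from src to dst (the regions must not overlap).
 * Short blocks use a plain byte loop, longer ones defer to memcpy(). */
void copy_block(char *dst, const char *src, size_t len)
{
    assert(dst != NULL && src != NULL);
    if (len < SMALL_COPY_LIMIT) {
        while (len--)
            *dst++ = *src++;
    } else {
        memcpy(dst, src, len);
    }
}
```

Both branches produce the same result, so the threshold can be tuned per platform without affecting correctness.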
> Finally, you also will have noticed the indirect sort line in the plot.
> This is because I was curious about when this method would beat a direct
> sort. Looking at the plot, the crossover point seems to be
> around strings of 128 bytes (much more, in fact, than I initially
> thought), and the gain becomes very significant (around 40% faster) at 256
> bytes. So perhaps it would make sense to add the option of choosing
> the indirect method when sorting such large strings. This, of course,
> would require more memory for the indices, but using 4 or 8 additional
> bytes (depending on whether we are on 32-bit or 64-bit), when each string
> takes 200 bytes, doesn't seem too crazy. In any case, it would be nice to
> document this in docstrings.
It would be easy to add this feature, but for the moment I think the best
thing is to document it.
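For readers unfamiliar with the indirect method, here is a rough sketch of the idea, not the numpy implementation: sort an array of indices and leave the fixed-width strings where they are, so each swap moves one index instead of a whole string. The function name indirect_sort and the use of qsort() with file-scope state are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* qsort() has no user-data argument, so this sketch keeps the buffer
 * description in file-scope variables. That makes it non-reentrant. */
static const char *sort_base;   /* start of the packed string buffer */
static size_t sort_width;       /* bytes per fixed-width string */

/* Compare two strings through their indices. */
static int cmp_index(const void *a, const void *b)
{
    size_t ia = *(const size_t *)a;
    size_t ib = *(const size_t *)b;
    return memcmp(sort_base + ia * sort_width,
                  sort_base + ib * sort_width, sort_width);
}

/* Fill idx[0..n-1] with the permutation that orders the n strings
 * stored back to back in base, each exactly width bytes long. */
void indirect_sort(const char *base, size_t n, size_t width, size_t *idx)
{
    assert(n == 0 || (base != NULL && idx != NULL));
    for (size_t i = 0; i < n; i++)
        idx[i] = i;
    sort_base = base;
    sort_width = width;
    qsort(idx, n, sizeof idx[0], cmp_index);
}
```

The extra memory is one size_t per string, which matches the 4 or 8 bytes per element mentioned above. The strings themselves are only read, never moved.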
Another fairly easy change that could be made is to support strided arrays.
That might speed up sorting of non-contiguous arrays and sorts on axes other
than -1. The only reason it isn't there now is that I originally wrote the
sorting routines for numarray, and numarray's upper-level interface passed
contiguous arrays to the sort functions.
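To illustrate what strided support means, here is a small sketch that sorts in place elements spaced a fixed number of bytes apart, without first copying them to a contiguous buffer. It is not the numpy code: the function name is hypothetical, it uses a simple insertion sort on doubles rather than quicksort, and it assumes a positive stride.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Insertion sort over n doubles whose starts are stride bytes apart.
 * memcpy() is used for loads and stores so that the sketch does not
 * depend on the elements being aligned for double access. */
void strided_insertion_sort(char *base, size_t n, size_t stride)
{
    assert(n == 0 || base != NULL);
    assert(stride >= sizeof(double));
    for (size_t i = 1; i < n; i++) {
        double key, cur;
        memcpy(&key, base + i * stride, sizeof key);
        size_t j = i;
        while (j > 0) {
            memcpy(&cur, base + (j - 1) * stride, sizeof cur);
            if (cur <= key)
                break;
            /* Shift the larger element one slot to the right. */
            memcpy(base + j * stride, &cur, sizeof cur);
            j--;
        }
        memcpy(base + j * stride, &key, sizeof key);
    }
}
```

Called on a C-ordered 2-D array with the row size as the stride, this sorts a single column and leaves the other columns untouched.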
> Be warned: I'd like to stress that these are figures for my _own
> laptop_. It would be nice if you could verify all of this on other
> architectures (your Core2 machine seems different enough). I can run
> the benchmarks on Windows (installed on the same laptop) too. Tell me
> if you are interested in my doing this.
It's easy enough to test if you compile from svn, just add your new copy
function and change the name in this line:
to use your function instead of copy_string.