[Numpy-discussion] String sort
Charles R Harris
Sat Feb 9 15:55:52 CST 2008
On Feb 9, 2008 2:42 PM, Charles R Harris <email@example.com> wrote:
> On Feb 9, 2008 2:29 PM, Francesc Altet <firstname.lastname@example.org> wrote:
> > Chuck,
> > One more thing on this. I've been doing some benchmarking with my
> > opt_memcpy() macro in the quicksort_string function, and I should say
> > that while it is definitely more efficient than my system memcpy for
> > small values of n (the number of bytes to copy), this doesn't hold
> > for all values of n. For example, for n<16, opt_memcpy() can be more
> > than 4x faster than system memcpy (and this is why I naively thought
> > that it would be faster in general). However, for n>80, memcpy beats
> > opt_memcpy by between 25% and 100% (depending on whether n is divisible
> > by 2, 4 or 8). This is on my Linux system (Ubuntu 7.10), but perhaps
> > with Windows the behaviour can be different.
> > I think I would be able to come up with a routine that can offer a
> > balance between opt_memcpy and system memcpy, but that should take some
> > time. So, until I (or anybody else) do more research on this, I think
> > it would be safer if you use system memcpy for string sorting in NumPy.
> The memcpy in newer compilers is actually pretty good. For integers and
> such it sometimes compiles inline using integer assignments, but I was loath
> to make it the default implementation until gcc >= 4.1.x became more
> common. However, strings might be a good place to use it.
I'm also thinking that at some point it becomes more efficient to do an
indirect sort followed by a take than to move all those big strings around.
But I guess we won't know where that point is until we have both versions
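
To make the two approaches concrete, here is a rough sketch at the Python
level. It only illustrates the idea and is not the C code in question; the
function names sort_direct and sort_indirect and the sample data are made up
for the example. np.sort, np.argsort and np.take are the existing NumPy calls.

```python
import numpy as np

def sort_direct(a):
    # Direct sort: the sort itself moves the fixed-width strings around.
    return np.sort(a)

def sort_indirect(a):
    # Indirect sort: sort an array of indices, then gather each string
    # exactly once with take.
    idx = np.argsort(a)
    return np.take(a, idx)

# Hypothetical sample data: short values stored in wide 64-byte fields.
words = np.array([b"pear", b"apple", b"fig", b"banana"], dtype="S64")

direct = sort_direct(words)
indirect = sort_indirect(words)
```

Both functions should return the same sorted array. Which one is faster
presumably depends on the string width and the array length, so the crossover
would have to be measured (for example with timeit) rather than assumed.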