[Numpy-discussion] String sort
Charles R Harris
charlesr.harris@gmail....
Fri Feb 8 12:07:20 CST 2008
On Feb 8, 2008 10:31 AM, Francesc Altet <faltet@carabos.com> wrote:
> A Friday 08 February 2008, Francesc Altet escrigué:
> > A Friday 08 February 2008, Charles R Harris escrigué:
> > > > Also, in the context of my work in indexing, and because of the
> > > > slowness of the current implementation in NumPy, I've ended with
> > > > an implementation of the quicksort method for 1-D array strings.
> > > > For moderately large arrays, it is about 2.5x-3x faster than the
> > > > (supposedly) mergesort version in NumPy, not only due to the
> > > > quicksort, but also because I've implemented a couple of macros
> > > > for efficient string swapping and copy. If this is of interest
> > > > for NumPy developers, tell me and I will provide the code.
> > >
> > > I have some code for this too and was going to merge it. Send yours
> > > along and I'll get to it this weekend.
> >
> > Ok, great. I'm attaching it. Tell me if you need some clarification
> > on the code.
>
> Ops. I've introduced a last-minute problem in my code. To fix this,
> just replace the flawed opt_strncmp() that I sent before by:
>
> static int inline opt_strncmp(char *a, char *b, int n) {
> int i;
> for (i=0; i<n; i++) {
> if (a[i] > b[i]) return i+1;
> if (a[i] < b[i]) return -(i+1);
> /* Another way, but seems equivalent in speed, at least here */
> /* if (a[i] != b[i]) */
> /* return (((unsigned char *)a)[i] - ((unsigned char *)b)[i]); */
> }
> return 0;
> }
>
> Apparently, this version works just fine.
>
Did you find this significantly faster than strncmp? There is also a unicode
compare, do you have thoughts about that?
Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://projects.scipy.org/pipermail/numpy-discussion/attachments/20080208/354844eb/attachment.html
More information about the Numpy-discussion
mailing list