[Numpy-discussion] String sort

Francesc Altet faltet@carabos....
Tue Feb 12 10:07:13 CST 2008


A Monday 11 February 2008, Charles R Harris escrigué:
> I'll check it out when I get home. As I say, it was running about 10%
> slower on my machine, but if it does better on most platforms it is
> probably the way to go. We can always change it in the future when
> everyone is running on quantum computers.

We've done some testing on newqsort in several computers in our company.  
Here are the results for ordering a list with 1 million of strings of 
length 15 filled with random information (using C rand()):

1) Ubuntu 7.1 (gcc 4.1.3, -O3, Intel Pentium 4 @ 2 GHz)
C qsort with C style compare: 2.450000
C qsort with Python style compare: 2.440000
NumPy newqsort: 0.650000

2) Windows XP (SP2) (MSVC 7.1, /Ox, Intel Pentium 4 @ 2 GHz)
C qsort with C style compare: 0.971000
C qsort with Python style compare: 0.962000
NumPy newqsort: 0.921000

3) SuSe LE 10.3 (gcc 4.2.1, -O3, AMD Opteron @ 2 GHz)
C qsort with C style compare: 0.640000
C qsort with Python style compare: 0.600000
NumPy newqsort: 0.590000

4) Debian 4.2.2 (lenny) (gcc 4.2.3, -O3, Intel Pentium 4 @ 3.2 GHz)
C qsort with C style compare: 1.770000
C qsort with Python style compare: 1.750000
NumPy newqsort: 0.440000

5) Mandriva 2008.0 (gcc 4.2.2, -O3, Intel Core2 Duo @ 1.5 GHz)
C qsort with C style compare: 1.590000
C qsort with Python style compare: 1.550000
NumPy newqsort: 0.510000

6) Ubuntu 7.1 (gcc 4.1.3, -O3, Intel Pentium 4 @ 2.5 GHz)
C qsort with C style compare: 1.890000
C qsort with Python style compare: 1.900000
NumPy newqsort: 0.500000

7) Ubuntu 7.1 (gcc 4.1.2, -O3, PowerPC 3 @ 1.3 GHz)
C qsort with C style compare: 3.030000
C qsort with Python style compare: 2.970000
NumPy newqsort: 1.040000

8) MacOSX 10.4 (Tiger) (gcc 4.0.1, -O3, PowerPC 3 @ 1.3 GHz)
C qsort with C style compare: 1.560000
C qsort with Python style compare: 1.510000
NumPy newqsort: 1.220000

All benchmarks have been run using the attached benchmark (if anybody 
else wants to join the fiesta, please report back your feedback).

Summarizing, one can say a couple of things:

* Recent Debian distros and derivatives (Ubuntu) as well as Mandriva are 
suffering from a innefficient system qsort (at least the implementation 
for strings).  SuSe Linux Enterprise 10.3 seems to have solved this.  
And Windows XP (SP2) and MacOSX (Tiger) looks like they have a 
relatively efficient implementation of qsort.

* The newqsort performs the best on all the platforms we have checked 
(ranging from a 5% of improvement on Opteron/SuSe, up to 3.8x with some 
Pentium4/Ubuntu systems).

All in all, I'd also say that newqsort would be a good candidate to be 
put into NumPy.

Cheers,

-- 
>0,0<   Francesc Altet     http://www.carabos.com/
V   V   Cárabos Coop. V.   Enjoy Data
 "-"
-------------- next part --------------
A non-text attachment was scrubbed...
Name: sort-string-bench.c
Type: text/x-csrc
Size: 5299 bytes
Desc: not available
Url : http://projects.scipy.org/pipermail/numpy-discussion/attachments/20080212/f5f75f10/attachment.bin 


More information about the Numpy-discussion mailing list