[Numpy-discussion] String sort

Francesc Altet faltet@carabos....
Thu Feb 14 02:27:26 CST 2008


On Wednesday 13 February 2008, Scott Ransom wrote:
> On Wednesday 13 February 2008 02:37:37 pm Francesc Altet wrote:
> > So, I'd say that the culprit is gcc 4.2.1, 64-bit (or at the very
> > least, the AMD Opteron architecture) and that newqsort performs really
> > well in general (provided that the compiler can find the best path
> > for optimizing its code).  Can anyone using a 64-bit platform with
> > both gcc 4.1.2 and 4.2.1 installed confirm this?
>
> Here are results from a 64-bit Debian system using a Core2 Duo 2.66
> GHz processor.
>
> I used gcc 3.4.6, 4.1.3, 4.2.3, and 4.3.0 (20080202 experimental)
> with -O2 and -O3.
>
> Summary:  There is a big difference between -O2 and -O3.  gcc-4.2
> seems slightly better than the other gccs.  And the newqsort is a lot
> faster (always) than the libc version.
>
> Scott
>
> eiger:/data1$ ./sort346_O2
> Benchmark with 1000000 strings of size 15
> C qsort with C style compare: 0.550000
> C qsort with Python style compare: 0.530000
> NumPy newqsort: 0.450000
>
> eiger:/data1$ ./sort346_O3
> Benchmark with 1000000 strings of size 15
> C qsort with C style compare: 0.550000
> C qsort with Python style compare: 0.520000
> NumPy newqsort: 0.350000
>
> eiger:/data1$ ./sort413_O2
> Benchmark with 1000000 strings of size 15
> C qsort with C style compare: 0.560000
> C qsort with Python style compare: 0.530000
> NumPy newqsort: 0.420000
>
> eiger:/data1$ ./sort413_O3
> Benchmark with 1000000 strings of size 15
> C qsort with C style compare: 0.540000
> C qsort with Python style compare: 0.500000
> NumPy newqsort: 0.280000
>
> eiger:/data1$ ./sort423_O2
> Benchmark with 1000000 strings of size 15
> C qsort with C style compare: 0.560000
> C qsort with Python style compare: 0.530000
> NumPy newqsort: 0.390000
>
> eiger:/data1$ ./sort423_O3
> Benchmark with 1000000 strings of size 15
> C qsort with C style compare: 0.530000
> C qsort with Python style compare: 0.500000
> NumPy newqsort: 0.270000
>
> eiger:/data1$ ./sort43_O2
> Benchmark with 1000000 strings of size 15
> C qsort with C style compare: 0.550000
> C qsort with Python style compare: 0.530000
> NumPy newqsort: 0.340000
>
> eiger:/data1$ ./sort43_O3
> Benchmark with 1000000 strings of size 15
> C qsort with C style compare: 0.530000
> C qsort with Python style compare: 0.510000
> NumPy newqsort: 0.330000

Thanks Scott.  Your input is very valuable, as it seems to confirm that
the problem must be in gcc 4.2.1 on 64-bit (or on the Opteron architecture
at the very least), because apparently your gcc 4.2.3 is doing very well.
It's a pity that I don't have 4.2.3 available on our SUSE/Opteron machine
to check whether the optimization flaw disappears.  But it seems to me
that the problem could be specific to 4.2.1, and that the GCC crew has
apparently fixed it in 4.2.3, which is a relief.

In any case, if anybody has access to an Opteron machine and gcc 4.2.3,
it would be great if they could run the benchmark and report their
results.
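
For anyone without the C benchmark at hand, here is a rough Python sketch
of the same kind of measurement (sorting many random strings of size 15).
It is not the program that produced the numbers above: it times Python's
built-in sorted() and, only if NumPy happens to be installed, np.sort on an
"S15" array, rather than libc qsort or newqsort.  The helper names
(make_strings, time_call, run_benchmark) are made up for illustration, and
the default of 100000 strings is an assumption chosen to keep the run short.

```python
import random
import string
import time

try:
    import numpy as np  # optional; only needed for the NumPy timing
except ImportError:
    np = None


def make_strings(n, size, seed=0):
    """Return n random lowercase ASCII byte strings, each `size` bytes long."""
    rng = random.Random(seed)
    alphabet = string.ascii_lowercase.encode("ascii")
    return [bytes(rng.choice(alphabet) for _ in range(size)) for _ in range(n)]


def time_call(func, *args):
    """Run func(*args) once and return (elapsed_seconds, result)."""
    start = time.perf_counter()
    result = func(*args)
    return time.perf_counter() - start, result


def run_benchmark(n=100_000, size=15):
    """Time the available sorts on the same data; return (timings, sorted list)."""
    data = make_strings(n, size)
    timings = {}
    timings["sorted"], expected = time_call(sorted, data)
    if np is not None:
        arr = np.array(data, dtype="S%d" % size)
        timings["numpy.sort"], result = time_call(np.sort, arr)
        # Sanity check: both sorts must agree on the ordering.
        assert result.tolist() == expected
    return timings, expected


if __name__ == "__main__":
    timings, _ = run_benchmark()
    for name, seconds in timings.items():
        print(f"{name}: {seconds:.6f}")
```

A single run like this is noisy, so repeating it a few times and keeping
the best time per sort gives a fairer comparison.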

Cheers,

-- 
>0,0<   Francesc Altet     http://www.carabos.com/
V   V   Cárabos Coop. V.   Enjoy Data
 "-"

