[Numpy-discussion] String sort

Francesc Altet faltet@carabos....
Wed Feb 13 13:37:37 CST 2008


On Wednesday 13 February 2008, Francesc Altet wrote:
> On Wednesday 13 February 2008, Bruce Southey wrote:
> > Hi,
> > I added gcc 4.2 from the openSUSE 10.1 repository so I now have
> > both the 4.1.2 and 4.2.1 compilers installed. But still have
> > glibc-2.4-31.1 installed. I see your result with 4.2.1 but not with
> > 4.1.2 so I think that there could be a difference in the compiler
> > flags. I don't know enough about those to help but I can test any
> > suggestions.
> >
> > $ gcc --version
> > gcc (GCC) 4.1.2 20070115 (prerelease) (SUSE Linux)
> > $ gcc -O3 sort-string-bench.c -o sort412
> > $ ./sort412
> > Benchmark with 1000000 strings of size 15
> > C qsort with C style compare: 0.630000
> > C qsort with Python style compare: 0.640000
> > NumPy newqsort: 0.360000
> >
> > $ gcc-4.2 --version
> > gcc-4.2 (GCC) 4.2.1 (SUSE Linux)
> > $ gcc-4.2 -O3 sort-string-bench.c -o sort421
> > $ ./sort421
> > Benchmark with 1000000 strings of size 15
> > C qsort with C style compare: 0.620000
> > C qsort with Python style compare: 0.610000
> > NumPy newqsort: 0.550000
> >
> > This is the same as:
> > $ gcc-4.2 -O2 -finline-functions sort-string-bench.c -o sort421
> > $ ./sort421
> > Benchmark with 1000000 strings of size 15
> > C qsort with C style compare: 0.710000
> > C qsort with Python style compare: 0.700000
> > NumPy newqsort: 0.550000
> >
> > (NumPy newqsort with -O2 alone is 0.60000)
> >
> > For completeness, 4.1.2 using '-O2' versus '-O2 -finline-functions'
> > is NumPy newqsort: 0.620000 vs NumPy newqsort: 0.500000
>
> That's really interesting.  Let me recall my figures for our
> Opteron:
>
> 3) SuSe LE 10.3 (gcc 4.2.1, -O3, AMD Opteron @ 2 GHz)
> C qsort with C style compare: 0.640000
> C qsort with Python style compare: 0.600000
> NumPy newqsort: 0.590000
>

Just an addendum.  I've compiled the benchmark with gcc 4.1.2 on our 
Opteron machine.  Here are the results:

SuSe LE 10.3 (gcc 4.1.2, -O3, AMD Opteron @ 2 GHz)
Benchmark with 1000000 strings of size 15
C qsort with C style compare: 0.620000
C qsort with Python style compare: 0.610000
NumPy newqsort: 0.380000

So, I'm getting about 55% better performance than with gcc 4.2.1 
(0.38 s versus 0.59 s)!
Also, I've installed gcc 4.2.1 on my laptop, and here are the results:

Ubuntu 7.10 (gcc 4.2.1, -O3, Intel Pentium 4 @ 2 GHz)
Benchmark with 1000000 strings of size 15
C qsort with C style compare: 2.450000
C qsort with Python style compare: 2.420000
NumPy newqsort: 0.630000

With gcc 4.1.2, I get:

Ubuntu 7.10 (gcc 4.1.3, -O3, Intel Pentium 4 @ 2 GHz)
Benchmark with 1000000 strings of size 15
C qsort with C style compare: 2.450000
C qsort with Python style compare: 2.440000
NumPy newqsort: 0.650000

So, in this case (32-bit platform) gcc 4.2.1 seems to perform similarly 
to 4.1.2.

So, I'd say that the culprit is gcc 4.2.1 on 64-bit (or, at the very 
least, on the AMD Opteron architecture) and that newqsort performs 
really well in general (provided that the compiler can find the best 
path for optimizing its code).  Can anyone using a 64-bit platform with 
both gcc 4.1.2 and 4.2.1 installed confirm this?
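
For anyone who wants to try this without sort-string-bench.c at hand, 
here is a minimal sketch of the kind of baseline labelled "C qsort with 
C style compare" above.  It is not the actual benchmark code: the 
record width ITEM_SIZE, the function names, and the choice of strncmp 
and clock() are all assumptions made for illustration.

```c
#include <stdlib.h>
#include <string.h>
#include <time.h>

#define ITEM_SIZE 15  /* assumed bytes per string, as in "strings of size 15" */

/* Comparator over fixed-width records (hypothetical name). qsort calls it
 * through a function pointer, so the compiler generally cannot inline it. */
static int compare_items(const void *a, const void *b)
{
    return strncmp((const char *)a, (const char *)b, ITEM_SIZE);
}

/* Sort n contiguous ITEM_SIZE-byte strings in place with the libc qsort. */
void sort_items(char *items, size_t n)
{
    qsort(items, n, ITEM_SIZE, compare_items);
}

/* Return 1 if the n records are in non-decreasing order, else 0. */
int items_sorted(const char *items, size_t n)
{
    for (size_t i = 1; i < n; i++) {
        const char *prev = items + (i - 1) * ITEM_SIZE;
        const char *cur = items + i * ITEM_SIZE;
        if (strncmp(prev, cur, ITEM_SIZE) > 0)
            return 0;
    }
    return 1;
}

/* CPU time in seconds spent sorting, measured with clock(). */
double timed_sort_seconds(char *items, size_t n)
{
    clock_t start = clock();
    sort_items(items, n);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Building the same file with each compiler and flag set (for example 
-O3, or -O2 -finline-functions) and comparing the value returned by 
timed_sort_seconds() gives figures of the same kind as those listed 
above.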

Also, MSVC 7.1 32-bit (with optimization level /Ox) doesn't seem to 
find such a best path: the newqsort benchmark takes 0.92 s with MSVC 
7.1, while gcc 4.1.2 takes 0.65 s on the same machine, about 40% 
faster.  I don't know whether newer versions of MSVC will do better, 
though.
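
The percentages quoted in this message (55% and 40%) are ratios of 
elapsed times.  A small sketch of that arithmetic follows; the function 
name speedup_percent is hypothetical and is not part of the benchmark.

```c
#include <assert.h>

/* Percentage by which the faster run outperforms the slower one,
 * from elapsed times in seconds: (slow / fast - 1) * 100. */
double speedup_percent(double slow_s, double fast_s)
{
    assert(slow_s > 0.0 && fast_s > 0.0);  /* elapsed times must be positive */
    return (slow_s / fast_s - 1.0) * 100.0;
}
```

With the Opteron figures, speedup_percent(0.59, 0.38) is a little over 
55, and with the MSVC versus gcc figures, speedup_percent(0.92, 0.65) 
is a little over 41.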

Cheers,

-- 
>0,0<   Francesc Altet     http://www.carabos.com/
V   V   Cárabos Coop. V.   Enjoy Data
 "-"

