[Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?)

Thomas Grill grrrr.org@gmail....
Sat Mar 22 19:06:54 CDT 2008


Hi,
here are my results:

Intel Core 2 Duo, 2.16GHz, 667MHz bus, 4MB Cache
running under OSX 10.5.2

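Please note that the gcc-4.3 auto-vectorizer (-ftree-vectorize) is doing really well: with it enabled, the plain "Simple" loop is faster than the hand-written "Intrin" version at every problem size.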
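
For anyone without vec_bench.c at hand, here is a rough sketch of what the "Simple" and "Intrin" columns might correspond to. The benchmark source is not part of this message, so everything here is an assumption: the element-wise float addition, and the names add_simple and add_intrin, are hypothetical stand-ins rather than the actual benchmark kernels. ("Inline" is presumably an inline-assembly variant and is not sketched.)

```c
#include <stddef.h>
#ifdef __SSE__
#include <xmmintrin.h>  /* SSE intrinsics: _mm_loadu_ps, _mm_add_ps, _mm_storeu_ps */
#endif

/* Hypothetical "Simple" kernel: a plain scalar loop.
   restrict promises that the arrays do not overlap, which makes it
   easier for an auto-vectorizer to turn the loop into SSE code. */
void add_simple(float *restrict dst, const float *restrict a,
                const float *restrict b, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        dst[i] = a[i] + b[i];
}

/* Hypothetical "Intrin" kernel: four floats per iteration via SSE
   intrinsics, using unaligned loads/stores so any pointer is accepted.
   Falls back to the scalar loop when SSE is not available. */
void add_intrin(float *dst, const float *a, const float *b, size_t n)
{
    size_t i = 0;
#ifdef __SSE__
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(dst + i, _mm_add_ps(va, vb));
    }
#endif
    /* Scalar tail for the remaining n % 4 elements (or all of them without SSE). */
    for (; i < n; ++i)
        dst[i] = a[i] + b[i];
}
```

The point of the comparison is that, with -ftree-vectorize, the compiler may generate SSE code for the first function on its own, so the hand-written second version no longer has an advantage. The sketch needs C99 (e.g. -std=c99 with older gcc versions) for restrict and the loop-scoped declarations.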

gr~~~

---------------------

gcc version 4.0.1 (Apple Inc. build 5465)

xbook-2:temp thomas$ gcc -msse -O2 vec_bench.c -o vec_bench
xbook-2:temp thomas$ ./vec_bench
Testing methods...
All OK

        Problem size              Simple              Intrin              Inline
                 100   0.0002ms (100.0%)   0.0001ms ( 83.2%)   0.0001ms ( 85.1%)
                1000   0.0014ms (100.0%)   0.0014ms ( 99.5%)   0.0014ms ( 97.6%)
               10000   0.0180ms (100.0%)   0.0137ms ( 76.1%)   0.0103ms ( 56.9%)
              100000   0.1307ms (100.0%)   0.1153ms ( 88.2%)   0.0952ms ( 72.8%)
             1000000   4.0309ms (100.0%)   4.1641ms (103.3%)   4.0129ms ( 99.6%)
            10000000  43.2557ms (100.0%)  43.5919ms (100.8%)  42.6391ms ( 98.6%)



gcc version 4.3.0 20080125 (experimental) (GCC)

xbook-2:temp thomas$ gcc-4.3 -msse -O2 vec_bench.c -o vec_bench
xbook-2:temp thomas$ ./vec_bench
Testing methods...
All OK

        Problem size              Simple              Intrin              Inline
                 100   0.0002ms (100.0%)   0.0001ms ( 77.4%)   0.0001ms ( 72.0%)
                1000   0.0017ms (100.0%)   0.0014ms ( 84.4%)   0.0014ms ( 79.4%)
               10000   0.0173ms (100.0%)   0.0148ms ( 85.4%)   0.0104ms ( 59.9%)
              100000   0.1276ms (100.0%)   0.1243ms ( 97.4%)   0.0952ms ( 74.6%)
             1000000   4.0466ms (100.0%)   4.1168ms (101.7%)   4.0348ms ( 99.7%)
            10000000  43.1842ms (100.0%)  43.2989ms (100.3%)  44.2171ms (102.4%)

xbook-2:temp thomas$ gcc-4.3 -msse -O2 -ftree-vectorize vec_bench.c -o vec_bench
xbook-2:temp thomas$ ./vec_bench
Testing methods...
All OK

        Problem size              Simple              Intrin              Inline
                 100   0.0001ms (100.0%)   0.0001ms (126.6%)   0.0001ms (120.3%)
                1000   0.0011ms (100.0%)   0.0014ms (136.3%)   0.0014ms (127.9%)
               10000   0.0144ms (100.0%)   0.0153ms (106.3%)   0.0103ms ( 72.0%)
              100000   0.1027ms (100.0%)   0.1243ms (121.0%)   0.0953ms ( 92.8%)
             1000000   3.9691ms (100.0%)   4.1197ms (103.8%)   4.0252ms (101.4%)
            10000000  42.1922ms (100.0%)  43.6721ms (103.5%)  43.4035ms (102.9%)

