[Numpy-discussion] OT: performance in C extension; OpenMP, or SSE ?

Sebastian Haase seb.haase@gmail....
Thu Feb 17 03:39:38 CST 2011

On Thu, Feb 17, 2011 at 10:29 AM, Matthieu Brucher
<matthieu.brucher@gmail.com> wrote:
>> Do you think, one could get even better ?
>> And, where does the 7% slow-down (for single thread) come from ?
>> Is it possible to have the OpenMP option in a code, without _any_
>> penalty for 1 core machines ?
> There will always be a penalty for parallel code that runs on one core. You
> have at least the overhead for splitting the data.
I was referring to the case where
num_threads=1;
is set explicitly.

Then, where does the overhead come from ? --
The call to    omp_set_dynamic(dynamic);
Or the
#pragma omp parallel for private(j, i,ax,ay, dif_x, dif_y)
or some magic done by
gcc ... -fopenmp
(I'm referring to Eric Carlson's code from earlier in this thread)

I'm wondering if one could have a C "if"-statement, e.g.
if(num_threads == 0)   to then not do any of the omp_xxx() calls.
Obviously, the #pragma would first have to be replaceable by some omp_xxx() call.
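
A minimal sketch of one way this might be done without replacing the #pragma: OpenMP has an if() clause on the parallel directive that is evaluated at run time. The function name pairwise_sqdist, its signature, and the loop body are hypothetical and only stand in for the real kernel; they are not taken from Eric Carlson's code. The sketch treats num_threads <= 1 as "run serially", which is an assumption.

```c
#include <stddef.h>

#ifdef _OPENMP
#include <omp.h>   /* only available when compiled with -fopenmp */
#endif

/* Hypothetical kernel: squared distances between all pairs of 2-D points.
 * out must hold n*n doubles. num_threads <= 1 requests the serial path. */
void pairwise_sqdist(const double *x, const double *y, int n,
                     double *out, int num_threads)
{
    int i;

#ifdef _OPENMP
    /* Skip the omp_xxx() call entirely when only one thread is wanted. */
    if (num_threads > 1)
        omp_set_num_threads(num_threads);
#endif

    /* if() is checked at run time: when the condition is false the
     * region is executed by a single thread. Without -fopenmp the
     * pragma is ignored and this is a plain serial loop. */
    #pragma omp parallel for if(num_threads > 1)
    for (i = 0; i < n; i++) {
        int j;   /* declared inside the loop, so private to each thread */
        for (j = 0; j < n; j++) {
            double dif_x = x[i] - x[j];
            double dif_y = y[i] - y[j];
            out[(size_t)i * n + j] = dif_x * dif_x + dif_y * dif_y;
        }
    }
}
```

Calling pairwise_sqdist(x, y, n, out, 1) takes the single-thread branch. Whether that removes the whole 7% would have to be measured, since the compiler may still generate the OpenMP form of the loop. To be sure of no penalty, one could instead keep two copies of the loop (one with the #pragma, one without) and choose between them with an ordinary C if.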

- Sebastian
