[Numpy-discussion] Python ctypes and OpenMP mystery

Francesc Alted faltet@pytables....
Thu Feb 17 02:58:14 CST 2011


On Thursday 17 February 2011 02:24:33, Eric Carlson wrote:
> Hello Francesc,
> The problem appears to be related to my lack of optimization in the
> compilation. If I use
> 
> gcc -O3 -c my_lib.c -fPIC -fopenmp -ffast-math
> 
> 
> the C executable and ctypes/python versions behave almost
> identically.

Ahh, good to know.

> Getting decent behavior takes some thought, though, far
> from the incredible almost-automatic behavior of numexpr.

numexpr uses a very simple method for distributing load among the
threads, so I suppose this is why it is fast.  The drawback is that
numexpr can only be used for operations in which every operand is read
at the same index (e.g. a+b**3, but not things like a[i+1]+b[i]**3).
For other operations OpenMP is probably the best option (I should say
the *easiest* option) right now.
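
As a rough illustration of the shifted-index case, here is a minimal
OpenMP sketch in C.  The function name shifted_sum and its signature are
made up for this example (they are not taken from my_lib.c), and the
code assumes that out does not overlap a or b, so that the loop
iterations are independent of each other.

```c
#include <stddef.h>

/* Hypothetical helper: out[i] = a[i+1] + b[i]**3 for i = 0 .. n-2.
 * Reads a[1..n-1] and b[0..n-2], writes out[0..n-2].
 * Assumes out does not alias a or b. */
void shifted_sum(const double *a, const double *b, double *out, size_t n)
{
    if (n < 2)
        return;                 /* nothing to compute */

    long m = (long)(n - 1);     /* signed loop bound for OpenMP */

    /* Each iteration writes a distinct out[i], so the loop can be
     * split among threads without any synchronization. */
    #pragma omp parallel for
    for (long i = 0; i < m; i++) {
        out[i] = a[i + 1] + b[i] * b[i] * b[i];
    }
}
```

If the file is built without -fopenmp, the pragma is ignored and the
loop simply runs serially, which makes it easy to compare both versions.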

> Now I've got to figure out how to scale up a bunch of vector
> adds/multiplies. Neither numexpr nor openmp gets you very far with a
> bunch of "z=a*x+b*y"-type calcs.

For this sort of computation you are most probably hitting the memory
bandwidth wall, so you are out of luck (at least until processors are
fast enough that compressing the data, and so moving fewer bytes to and
from memory, actually reduces the total time spent in computations).
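
To put a rough number on the bandwidth wall, here is a small sketch in
C, assuming double-precision arrays.  For z=a*x+b*y each element needs
two 8-byte loads and one 8-byte store but only three floating-point
operations, so memory traffic dominates.  The names axpby and
axpby_min_seconds are hypothetical, and the estimate ignores caches and
any write-allocate traffic, so treat it as a back-of-the-envelope lower
bound only.

```c
#include <stddef.h>

/* Hypothetical kernel: z[i] = a*x[i] + b*y[i] for i = 0 .. n-1.
 * Assumes z does not alias x or y. */
void axpby(double a, const double *x, double b, const double *y,
           double *z, size_t n)
{
    long m = (long)n;           /* signed loop bound for OpenMP */

    #pragma omp parallel for
    for (long i = 0; i < m; i++) {
        z[i] = a * x[i] + b * y[i];
    }
}

/* Lower bound on the run time of axpby, in seconds, if the only limit
 * were a sustained memory bandwidth given in bytes per second.
 * Counts 2 loads + 1 store of a double per element. */
double axpby_min_seconds(size_t n, double bandwidth_bytes_per_s)
{
    double bytes = 3.0 * (double)sizeof(double) * (double)n;
    return bytes / bandwidth_bytes_per_s;
}
```

Once the measured time is close to this bound, adding more threads
cannot help much, because all of them share the same memory bus.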

Cheers,

-- 
Francesc Alted

