[Numpy-discussion] Python ctypes and OpenMP mystery
Francesc Alted
faltet@pytables....
Thu Feb 17 02:58:14 CST 2011
On Thursday 17 February 2011 02:24:33 Eric Carlson wrote:
> Hello Francesc,
> The problem appears to be related to my lack of optimization in the
> compilation. If I use
>
> gcc -O3 -c my_lib.c -fPIC -fopenmp -ffast-math
>
>
> the C executable and ctypes/python versions behave almost
> identically.
Ahh, good to know.
> Getting decent behavior takes some thought, though, far
> from the incredible almost-automatic behavior of numexpr.
numexpr uses a very simple method for distributing the load among
threads, so I suppose this is why it is fast. The drawback is that
numexpr can only be used for element-wise operations, where every
operand is read at the same index (e.g. a+b**3, but not things like
a[i+1]+b[i]**3). For other operations, openmp is probably the best
option (I should say the *easiest* option) right now.
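
To make the distinction concrete, here is a minimal sketch of the two
kinds of expression. It assumes NumPy is installed and treats numexpr
as optional, falling back to plain NumPy if the import fails. The
function names same_index and shifted_index are made up for
illustration. The shifted case is written with NumPy slices, since a
numexpr expression string has no indexing syntax.

```python
import numpy as np

try:
    import numexpr as ne  # optional; the sketch falls back to NumPy
except ImportError:
    ne = None


def same_index(a, b):
    """Compute a[i] + b[i]**3: every output needs only index i."""
    if ne is not None:
        # Operands are passed explicitly instead of relying on frame lookup.
        return ne.evaluate("a + b**3", local_dict={"a": a, "b": b})
    return a + b**3


def shifted_index(a, b):
    """Compute a[i+1] + b[i]**3: output i mixes two input indices."""
    # The slices are views that line the operands up, so the result
    # has one element fewer than the inputs.
    return a[1:] + b[:-1] ** 3


a = np.arange(5, dtype=np.float64)
b = np.arange(5, dtype=np.float64)

print(same_index(a, b))
print(shifted_index(a, b))
```

The slicing version still goes through NumPy temporaries, so it says
nothing about how an openmp loop over the same expression would
perform.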
> Now I've got to figure out how to scale up a bunch of vector
> adds/multiplies. Neither numexpr nor openmp gets you very far with a
> bunch of "z=a*x+b*y"-type calcs.
For this sort of computation you are most probably hitting the memory
bandwidth wall, so you are out of luck (at least until processors are
fast enough for compression to actually reduce the time spent in
computations).
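
A back-of-the-envelope estimate shows why extra threads help so
little here. The sketch below assumes that a and b are scalars, that
x, y and z are float64 vectors too large for the cache, and that the
machine sustains 10 GB/s of memory bandwidth. That bandwidth figure
is purely hypothetical, and the name axpby_traffic is made up for
illustration.

```python
def axpby_traffic(n, itemsize=8):
    """Rough memory traffic and flop count for z = a*x + b*y on n elements."""
    bytes_moved = 3 * itemsize * n  # read x, read y, write z
    flops = 3 * n                   # two multiplies and one add per element
    return bytes_moved, flops


ASSUMED_BANDWIDTH = 10e9  # bytes per second; hypothetical, not measured

n = 10_000_000
bytes_moved, flops = axpby_traffic(n)

# Flops performed per byte moved; a low value means memory-bound.
intensity = flops / bytes_moved

# Lower bound on run time if memory traffic is the only cost.
min_seconds = bytes_moved / ASSUMED_BANDWIDTH

print(f"{intensity:.3f} flops/byte, at least {min_seconds * 1e3:.1f} ms")
```

At 0.125 flops per byte the arithmetic is nearly free compared with
moving the data, so once the threads together saturate the memory bus,
adding more of them does not shorten the run.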
Cheers,
--
Francesc Alted