[Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?)

David Cournapeau david@ar.media.kyoto-u.ac...
Fri Mar 21 23:53:31 CDT 2008


Anne Archibald wrote:
>
> There was some discussion of this recently. The most direct approach
> to the problem is to annotate some or all of numpy's inner C loops
> with OpenMP constructs, then provide some python functions to control
> the degree of parallelism OpenMP uses. This would transparently
> provide parallelism for many numpy operations, including sum(),
> numpy's version of IDL's total(). All that is needed is for someone to
> implement it. Nobody has stepped forward yet.
I am not really familiar with OpenMP (I have only played with it on toy
problems). From a build point of view, here are the problems I can see
without knowing much about it:
    - compiler support: at the source code level, OpenMP works only
through pragmas, right? So we will get warnings from compilers that do
not support OpenMP if we just use the pragmas as is (this could be
solved with a macro, I guess).
    - compiler flags and link flags: at least gcc needs flags for
compiling and linking code with OpenMP. This means detecting whether
the compiler supports it.
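
To make the macro idea concrete, here is a rough sketch, not a proposal
for the actual numpy sources. HYP_OMP and hyp_sum are made-up names, and
it assumes a C99 compiler (for _Pragma). It relies on _OPENMP, which
compilers define when OpenMP is enabled:

```c
#include <stddef.h>

/* HYP_OMP is a hypothetical macro name, not something numpy defines. */
#ifdef _OPENMP
/* OpenMP enabled: turn the argument into a real #pragma line. */
#define HYP_OMP(directive) _Pragma(#directive)
#else
/* No OpenMP: expand to nothing, so the compiler never sees the pragma. */
#define HYP_OMP(directive)
#endif

/* Sum n doubles; the loop is parallel only in an OpenMP-enabled build. */
double hyp_sum(const double *data, size_t n)
{
    double total = 0.0;
    long i; /* signed loop index, which older OpenMP versions require */

    HYP_OMP(omp parallel for reduction(+:total))
    for (i = 0; i < (long)n; i++) {
        total += data[i];
    }
    return total;
}
```

Built without OpenMP flags, this is an ordinary serial loop with no
unknown-pragma warnings; built with them (e.g. -fopenmp for gcc), the
same source runs the loop in parallel. One limitation of this style of
macro is that a directive containing a top-level comma would be split
into several macro arguments.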

This does not sound too bad, but this needs to work reliably on all 
supported platforms. Of course, I can add this to numscons; adding it to 
distutils would be a bit more work, but I can do it too if someone else 
is willing to do the actual coding in the C sources.

Now, the main concern I would have is the effectiveness of all this on
simple operations. I note that matlab 2007a, while claiming support for
multi-core, does not use multiple cores for simple operations, only for
FFT, BLAS and LAPACK (where this should be possible right now if e.g.
using Intel MKL, am I right?). Matlab 7.6 also supports multithreading
for things like element-wise computation (a = sin(b)):

http://www.mathworks.com/products/matlab/demos.html?file=/products/demos/matlab/multithreadedcomputations/multithreadedcomputations.html

Personally, I am wondering whether it would not be more worthwhile to
think first about SSE and co, because it can give the same order of
increase in speed, without all the problems linked to multi-threading
(being slower in the single-threaded case, in particular).
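
For comparison, here is a minimal sketch of what the SSE route could
look like for an element-wise operation. hyp_add_f32 is a made-up name,
and the sketch assumes an x86 compiler that defines __SSE__ (as gcc
does); everywhere else it falls back to the plain scalar loop:

```c
#include <stddef.h>

#if defined(__SSE__)
#include <xmmintrin.h> /* SSE intrinsics: __m128, _mm_add_ps, ... */
#endif

/* Hypothetical helper: out[i] = a[i] + b[i] for i in [0, n). */
void hyp_add_f32(const float *a, const float *b, float *out, size_t n)
{
    size_t i = 0;

#if defined(__SSE__)
    /* Process four floats per iteration; unaligned loads and stores
       are used so the arrays need no special alignment. */
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_add_ps(va, vb));
    }
#endif

    /* Scalar loop: handles the tail, or everything without SSE. */
    for (; i < n; i++) {
        out[i] = a[i] + b[i];
    }
}
```

This stays single-threaded, so there is no thread startup or
synchronization cost to pay on small arrays.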

cheers,

David

