[Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?)
Travis E. Oliphant
Sat Mar 22 13:17:01 CDT 2008
James Philbin wrote:
> Personally, I think that the time would be better spent optimizing
> routines for single-threaded code and relying on BLAS and LAPACK
> libraries to use multiple cores for more complex calculations. In
> particular, doing some basic loop unrolling and SSE versions of the
> ufuncs would be beneficial. I have some experience writing SSE code
> using intrinsics and would be happy to give it a shot if people tell
> me what functions I should focus on.
Fabulous! This is on my project list of to-do items for NumPy (see
http://projects.scipy.org/scipy/numpy/wiki/ProjectIdeas). I should spend
some time refactoring the ufunc loops so that the templating does not
get in the way of doing this on a case-by-case basis.
1) You should focus on the math operations: add, subtract, multiply,
divide, and so forth.
2) Then for "combined operations" we should expose the functionality at
a high level, so that somebody could write code to take advantage of it.
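To make point 2 concrete, here is a rough sketch of what a "combined operation" could look like at the C level: a multiply and an add done in one pass over the data, so no temporary array is needed for the intermediate product. The function name muladd_float and its signature are hypothetical, chosen only for illustration; nothing like this is implied to exist in NumPy.

```c
#include <stddef.h>

/* Hypothetical combined operation: out[i] = a[i] * b[i] + c[i].
   A separate multiply followed by a separate add would make two passes
   over memory and need a temporary array; this makes a single pass. */
static void muladd_float(const float *a, const float *b, const float *c,
                         float *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        out[i] = a[i] * b[i] + c[i];
    }
}
```

How such a loop would be exposed to Python is a separate design question that this sketch does not address.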
It would be easiest to use intrinsics, which would then work on both AMD
and Intel processors and with multiple compilers.
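For reference, a minimal sketch of an SSE version of a float32 add loop written with intrinsics might look like the following. It assumes an x86 target with SSE enabled and contiguous input arrays; the name add_float_sse is hypothetical and this is not the actual ufunc inner-loop signature, which also has to deal with strides. Unaligned loads and stores are used so that no alignment assumption is made about the buffers.

```c
#include <stddef.h>
#include <xmmintrin.h>  /* SSE intrinsics: __m128, _mm_add_ps, ... */

/* Hypothetical contiguous float add: out[i] = a[i] + b[i]. */
static void add_float_sse(const float *a, const float *b,
                          float *out, size_t n)
{
    size_t i = 0;

    /* Main loop: four floats per iteration. */
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);   /* unaligned load of 4 floats */
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_add_ps(va, vb));
    }

    /* Scalar tail for the remaining 0 to 3 elements. */
    for (; i < n; i++) {
        out[i] = a[i] + b[i];
    }
}
```

The same source should build with any compiler that provides xmmintrin.h, which is the portability argument for intrinsics over inline assembly.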