[Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?)
Travis E. Oliphant
oliphant@enthought....
Sat Mar 22 13:20:30 CDT 2008
Charles R Harris wrote:
>
>
> On Sat, Mar 22, 2008 at 11:43 AM, Neal Becker <ndbecker2@gmail.com> wrote:
>
> James Philbin wrote:
>
> > Personally, I think that the time would be better spent optimizing
> > routines for single-threaded code and relying on BLAS and LAPACK
> > libraries to use multiple cores for more complex calculations. In
> > particular, doing some basic loop unrolling and SSE versions of the
> > ufuncs would be beneficial. I have some experience writing SSE code
> > using intrinsics and would be happy to give it a shot if people tell
> > me what functions I should focus on.
> >
> > James
>
> gcc keeps advancing autovectorization. Is manual vectorization worth
> the trouble?
>
>
> The inner loop of a unary ufunc looks like
>
> /*UFUNC_API*/
> static void
> PyUFunc_d_d(char **args, intp *dimensions, intp *steps, void *func)
> {
>     intp i;
>     char *ip1=args[0], *op=args[1];
>     for(i=0; i<*dimensions; i++, ip1+=steps[0], op+=steps[1]) {
>         *(double *)op = ((DoubleUnaryFunc *)func)(*(double *)ip1);
>     }
> }
>
>
> While it might help the compiler to put the steps on the stack as
> constants, it is hard to see how the compiler could vectorize the loop
> given the information available and the fact that the input data might
> not be aligned or contiguous. I suppose one could make a small local
> buffer, copy the data into it, and then use sse, and that might
> actually help for some things. But it is also likely that the function
> itself won't deal gracefully with vectorized data.
I think the thing to do is to special-case the code: if the strides
work for vectorization, a different bit of code is executed, and the
current code is kept as the final, general case.
Something like this would be relatively straightforward to do, if a bit
tedious.
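
A rough sketch of what such special-casing could look like, using the same loop signature as the quoted code. The function name `d_d_loop_specialized`, the `intp` and `DoubleUnaryFunc` typedefs, and the helper `twice` are stand-ins made up for illustration so the snippet compiles on its own; they are not taken from the NumPy sources. Only the contiguous case is special-cased here, and alignment is not checked.

```c
#include <stddef.h>

/* Assumption: stand-ins for the typedefs used by the quoted NumPy loop. */
typedef ptrdiff_t intp;
typedef double DoubleUnaryFunc(double);

/* Hypothetical example function to pass in as `func`. */
static double
twice(double x)
{
    return 2.0 * x;
}

/* Hypothetical name; a sketch, not an existing NumPy function. */
static void
d_d_loop_specialized(char **args, intp *dimensions, intp *steps, void *func)
{
    intp i, n = dimensions[0];
    char *ip1 = args[0], *op = args[1];
    DoubleUnaryFunc *f = (DoubleUnaryFunc *)func;

    if (steps[0] == (intp)sizeof(double) && steps[1] == (intp)sizeof(double)) {
        /* Special case: input and output are both contiguous doubles,
           so the loop can be written with plain array indexing. */
        double *in = (double *)ip1;
        double *out = (double *)op;
        for (i = 0; i < n; i++) {
            out[i] = f(in[i]);
        }
        return;
    }

    /* Final case: arbitrary strides, same as the current code. */
    for (i = 0; i < n; i++, ip1 += steps[0], op += steps[1]) {
        *(double *)op = f(*(double *)ip1);
    }
}
```

Note that as long as the operation is reached through a function pointer, the compiler still cannot vectorize the call itself. The contiguous branch pays off mainly when the operation is written inline in the loop body, or replaced by hand-written SSE code, for a specific ufunc.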
-Travis