[Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?)

James Philbin philbinj@gmail....
Sun Mar 23 13:22:48 CDT 2008

OK, I'm really impressed with the improvements in vectorization in
gcc 4.3. It really seems to be able to work with real loops, which
wasn't the case with 4.1. I think Chuck's right that we should simply
special-case contiguous data and let the auto-vectorizer do the
rest. Something like this for the ufuncs:

 /**begin repeat

   #TYPE=(BOOL, BYTE, UBYTE, SHORT, USHORT, INT, UINT, LONG, ULONG,
LONGLONG, ULONGLONG, FLOAT, DOUBLE, LONGDOUBLE)*2#
   #OP=||, +*13, ^, -*13#
   #kind=add*14, subtract*14#
   #typ=(Bool, byte, ubyte, short, ushort, int, uint, long, ulong,
longlong, ulonglong, float, double, longdouble)*2#
 */

static void
@TYPE@_@kind@_contig(@typ@ *i1, @typ@ *i2, @typ@ *op, intp n)
{
    intp i;
    for (i=0; i<n; i++) {
        op[i] = i1[i] @OP@ i2[i];
    }
}

static void
@TYPE@_@kind@(char **args, intp *dimensions, intp *steps, void *func)
{
    register intp i;
    intp is1=steps[0],is2=steps[1],os=steps[2], n=dimensions[0];
    char *i1=args[0], *i2=args[1], *op=args[2];

    /* steps are in bytes, so contiguous means a stride of sizeof(@typ@) */
    if (is1==sizeof(@typ@) && is2==sizeof(@typ@) && os==sizeof(@typ@)) {
        @TYPE@_@kind@_contig((@typ@ *)i1, (@typ@ *)i2, (@typ@ *)op, n);
        return;
    }

    for(i=0; i<n; i++, i1+=is1, i2+=is2, op+=os) {
        *((@typ@ *)op)=*((@typ@ *)i1) @OP@ *((@typ@ *)i2);
    }
}
/**end repeat**/

We also need to add -ftree-vectorize to the standard compile flags at
least for the ufuncs.
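
For concreteness, here is roughly what one instance of that idea looks
like when written out by hand for double addition. This is only a
sketch under a few assumptions: intp is a stand-in typedef (ptrdiff_t)
so it compiles outside numpy, the function names are illustrative, and
"contiguous" is taken to mean that all three byte strides equal
sizeof(double).

```c
#include <stddef.h>

/* Stand-in for numpy's intp typedef (an assumption for this sketch). */
typedef ptrdiff_t intp;

/* Contiguous fast path: a plain indexed loop over typed pointers,
 * which is the shape the auto-vectorizer is meant to pick up. */
static void
DOUBLE_add_contig(const double *i1, const double *i2, double *op, intp n)
{
    intp i;
    for (i = 0; i < n; i++) {
        op[i] = i1[i] + i2[i];
    }
}

/* Generic ufunc-style inner loop. The steps are byte strides, so the
 * pointers are advanced as char* in the general case. */
static void
DOUBLE_add(char **args, intp *dimensions, intp *steps, void *func)
{
    intp i;
    intp is1 = steps[0], is2 = steps[1], os = steps[2];
    intp n = dimensions[0];
    char *i1 = args[0], *i2 = args[1], *op = args[2];
    const intp elsize = (intp)sizeof(double);

    (void)func;  /* unused in this sketch */

    /* Special-case contiguous data and hand it to the simple loop. */
    if (is1 == elsize && is2 == elsize && os == elsize) {
        DOUBLE_add_contig((const double *)i1, (const double *)i2,
                          (double *)op, n);
        return;
    }

    /* Fallback for arbitrary strides. */
    for (i = 0; i < n; i++, i1 += is1, i2 += is2, op += os) {
        *((double *)op) = *((double *)i1) + *((double *)i2);
    }
}
```

Compiling this with something like "gcc -O2 -ftree-vectorize" should
give the vectorizer a chance at the contiguous loop; whether it
actually vectorizes depends on the gcc version and target, so it is
worth checking the compiler's vectorization report rather than
assuming it did.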
