[Numpy-discussion] Numpy and OpenMP
Mon Mar 17 13:34:05 CDT 2008
A Monday 17 March 2008, Christopher Barker escrigué:
> > > Plus a certain amount of numpy code depends on order of
> > > evaluation:
> > >
> > > a[:-1] = 2*a[1:]
> I'm confused here. My understanding of how it now works is that the
> above translates to:
> 1) create a new array (call it temp1) from a[1:], which shares a's
> data block.
> 2) create a temp2 array by multiplying 2 times each of the elements
> in temp1, and writing them into a new array, with a new data block 3)
> copy that temporary array into a[:-1]
> Why couldn't step (2) be parallelized? Why isn't it already with,
> BLAS? Doesn't BLAS must have such simple routines?
Probably yes, but the problem is that this kind of operations, namely,
vector-to-vector (usually found in the BLAS1 subset of BLAS), are
normally memory-bounded, so you can take little avantage from using
BLAS, most specially in modern processors, where the gap between the
CPU throughput and the memory bandwith is quite high (and increasing).
In modern machines, the use of BLAS is more interesting in vector-matrix
(BLAS2) computations, but definitely is in matrix-matrix (BLAS3) ones
(which is where the oportunities for cache reuse is higher) where the
speedups can really be very good.
> Also, maybe numexpr could benefit from this?
Maybe, but unfortunately it wouldn't be able to achieve high speedups.
Right now, numexpr is focused in accelerating mainly vector-vector
operations (or matrix-matrix, but element-wise, much like NumPy, so
that the cache cannot be reused), with some smart optimizations for
strided and unaligned arrays (in this scenario, it can be 2x or 3x
faster than NumPy, even for very simple operations like 'a+b').
In a similar way, OpenMP (or whatever parallel paradigm) will only
generally be useful when you have to deal with lots of data, and your
algorithm can have the oportunity to structure it so that small
portions of them can be reused many times.
>0,0< Francesc Altet http://www.carabos.com/
V V Cárabos Coop. V. Enjoy Data
More information about the Numpy-discussion