[Numpy-discussion] Fwd: GPU Numpy

Citi, Luca lciti@essex.ac...
Thu Sep 10 03:41:10 CDT 2009


Hi Sturla,

> The proper way to speed up "dot(a*b+c*sqrt(d), e)" is to get rid of 
> temporary intermediates.
I implemented a patch
http://projects.scipy.org/numpy/ticket/1153
that reduces the number of temporary intermediates;
in your example, from 4 to 2.
The memory footprint improves a lot, and the speed
improves somewhat (especially for large matrices),
but not as much as I expected.
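
As a rough illustration of what going from 4 to 2 temporaries means
for the quoted expression, the same reduction can be done by hand with
the `out` argument of NumPy ufuncs. This is only a sketch of the idea,
not the code of the patch in the ticket; the function names `naive` and
`fewer_temporaries` are made up for the example, and the arrays are
assumed to be 1-D floating-point arrays of equal length.

```python
import numpy as np

def naive(a, b, c, d, e):
    # Four full-size temporaries: a*b, sqrt(d), c*sqrt(d), and their sum.
    return np.dot(a * b + c * np.sqrt(d), e)

def fewer_temporaries(a, b, c, d, e):
    # Two full-size temporaries, each reused in place via `out`.
    t1 = np.multiply(a, b)        # temporary 1 holds a*b
    t2 = np.sqrt(d)               # temporary 2 holds sqrt(d)
    np.multiply(c, t2, out=t2)    # t2 becomes c*sqrt(d), no new array
    np.add(t1, t2, out=t1)        # t1 becomes a*b + c*sqrt(d), no new array
    return np.dot(t1, e)
```

Both functions leave the inputs untouched and should agree up to
floating-point rounding.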

In your example
> result = 0
> for i in range(n):
>     result += (a[i]*b[i] + c[i]*sqrt(d[i])) * e[i]
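another big speedup could come from the fact that the
loop makes better use of the cache: each element is used
as soon as it is loaded, and no full-size intermediate
array has to go through main memory.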

That is exactly why numexpr is faster in these cases.
I hope one day numpy will be able to perform such
optimizations.
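
The cache effect can be approximated in plain NumPy by evaluating the
expression block by block, so that the intermediates stay small enough
to remain in cache. The sketch below is only meant to show the idea;
the name `blocked_dot` and the default block size of 4096 elements are
assumptions for the example, not something taken from numexpr or from
the patch, and the inputs are assumed to be 1-D arrays of equal length.

```python
import numpy as np

def blocked_dot(a, b, c, d, e, block=4096):
    """Compute dot(a*b + c*sqrt(d), e) one block at a time."""
    result = 0.0
    for start in range(0, len(a), block):
        s = slice(start, start + block)   # the last block may be shorter
        t = np.sqrt(d[s])                 # block-sized temporary
        t *= c[s]                         # t = c*sqrt(d), in place
        t += a[s] * b[s]                  # t = a*b + c*sqrt(d)
        result += np.dot(t, e[s])         # accumulate the partial dot product
    return result
```

The result should match `np.dot(a*b + c*np.sqrt(d), e)` up to rounding,
since only the order of the summation changes. A good block size depends
on the machine and is best found by timing.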

Best,
Luca

