[Numpy-discussion] Fwd: GPU Numpy
Dag Sverre Seljebotn
Wed Sep 9 10:34:03 CDT 2009
Christopher Barker wrote:
> George Dahl wrote:
>> Sturla Molden <sturla <at> molden.no> writes:
>>> Teraflops peak performance of modern GPUs is impressive. But NumPy
>>> cannot easily benefit from that.
>> I know that for my work, I can get a speedup on the order of 50-fold over
>> numpy using a Python wrapper for a simple GPU matrix class.
> I think you're talking past each other here. Sturla is referring to
> making a numpy ndarray gpu-aware and then expecting expressions like:
> z = a*x**2 + b*x + c
> to go faster when a, b, c, and x are ndarrays.
> That's not going to happen.
> On the other hand, George is talking about moving higher-level
> operations (like a matrix product) over to GPU code. This is analogous
> to numpy.linalg and numpy.dot() using LAPACK routines, and yes, that
> could help those programs that use such operations.
> So a GPU LAPACK would be nice.
> This is also analogous to using SWIG, or ctypes or cython or weave, or
> ??? to move a computationally expensive part of the code over to C.
> I think anything that makes it easier to write little bits of your code
> for the GPU would be pretty cool -- a GPU-aware Cython?
Cython is probably open to that if anybody's interested in implementing
it or making a student project of it (way too big for GSoC, I think).
However, I'd definitely make it a generic library turning expressions
into compiled code (either GPU or CPU w/SSE); that could then be used
both at compile-time from Cython, or at run-time using e.g. SymPy or
SAGE expressions. Both PyCUDA and CorePy would tend to allow both
compile-time operation and run-time operation.
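
To make the "expressions into compiled code" idea concrete, here is a minimal
run-time sketch in plain Python. It is only an illustration under stated
assumptions: the names Expr, Var, Const, BinOp and compile_elementwise are
hypothetical and are not part of Cython, PyCUDA, CorePy, SymPy or SAGE, and
the "backend" emits Python source rather than GPU or SSE code. The point is
only the shape of such a library: operator overloading builds an expression
tree, and a code generator turns the whole tree into one fused elementwise
loop instead of one temporary array per operator.

```python
class Expr:
    """Node in a tiny expression tree; arithmetic operators build larger trees."""

    def __add__(self, other):
        return BinOp("+", self, wrap(other))

    def __radd__(self, other):
        return BinOp("+", wrap(other), self)

    def __mul__(self, other):
        return BinOp("*", self, wrap(other))

    def __rmul__(self, other):
        return BinOp("*", wrap(other), self)

    def __pow__(self, other):
        return BinOp("**", self, wrap(other))


class Var(Expr):
    """A named array argument; emitted as an indexed load, e.g. x[i]."""

    def __init__(self, name):
        self.name = name

    def emit(self):
        return f"{self.name}[i]"

    def names(self):
        return {self.name}


class Const(Expr):
    """A scalar literal that is baked into the generated source."""

    def __init__(self, value):
        self.value = value

    def emit(self):
        return repr(self.value)

    def names(self):
        return set()


class BinOp(Expr):
    """A binary operation on two sub-expressions."""

    def __init__(self, op, left, right):
        self.op, self.left, self.right = op, left, right

    def emit(self):
        return f"({self.left.emit()} {self.op} {self.right.emit()})"

    def names(self):
        return self.left.names() | self.right.names()


def wrap(value):
    """Promote plain Python numbers to Const nodes."""
    return value if isinstance(value, Expr) else Const(value)


def compile_elementwise(expr):
    """Generate and compile one fused loop for the whole expression.

    Returns (kernel, source). A real backend would emit C or CUDA here;
    this sketch emits Python source and compiles it with exec().
    """
    args = sorted(expr.names())
    if not args:
        raise ValueError("expression contains no array variables")
    source = (
        f"def kernel({', '.join(args)}):\n"
        f"    n = len({args[0]})\n"
        f"    return [{expr.emit()} for i in range(n)]\n"
    )
    namespace = {}
    exec(source, namespace)
    return namespace["kernel"], source


# The expression from the thread, built symbolically and then compiled.
a, b, c, x = Var("a"), Var("b"), Var("c"), Var("x")
poly = a * x**2 + b * x + c
kernel, source = compile_elementwise(poly)
```

The generated kernel takes its arguments in alphabetical order (a, b, c, x
here), and printing source shows the single loop that was produced. The same
tree could just as well be walked at compile time, which is where a Cython
front end would plug in.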