[Numpy-discussion] Accelerating NumPy computations [Was: GPU Numpy]
Fri Aug 21 17:06:50 CDT 2009
Matthew Brett, on 2009-08-21 11:51, wrote:
> > Indeed. In the future, if OpenCL is the way to go, it may even be
> > helpful to have Numpy using OpenCL directly, as AMD provides an SDK
> > for OpenCL, and with Larrabee approaching, Intel will surely provide
> > one of its own.
> I was just in a lecture by one of the Intel people about OpenCL:
> He offered no schedule for an Intel OpenCL implementation, but said
> that they were committed to it.
> The lectures in general were effective in pointing out what a
> time-consuming effort it can be moving algorithms into the
> parallel world - including GPUs. The lecture that just ended cited the
> example of a CUDA-based BLAS implementation on the GPU that was slower
> than the CPU version. Making BLAS go faster required a lot of work
> to find optimal strategies for blocking, transfer between CPU / GPU
> shared memory / GPU registers, vector sizes and so on - this on a
> specific NVIDIA architecture.
> I can imagine Numpy being useful for scripting in this
> C-and-assembler-centric world, making it easier to write automated
> testers, or even generate C code.
> Is anyone out there working on this kind of stuff? I ask only because
> there seems to be considerable interest here on the Berkeley campus.
This is exactly the sort of thing you can do with PyCUDA, which makes it
In particular, see the metaprogramming portion of the docs:
The metaprogramming section of the slides and source code from Nicolas
Pinto and Andreas Klöckner's *excellent* SciPy2009 tutorials is even more thorough:
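
To make the idea of generating C code from Python concrete, here is a
minimal sketch that only builds CUDA source strings for several candidate
tuning parameters. It is not taken from the PyCUDA docs or the tutorials;
the kernel name scale_add, the block_size and unroll parameters, and the
helper functions are all hypothetical and chosen for illustration.

```python
from string import Template

# Hypothetical kernel template. BLOCK_SIZE and the unroll factor are the
# tuning knobs; the kernel body is generated in Python rather than
# written by hand.
KERNEL_TEMPLATE = Template("""
#define BLOCK_SIZE $block_size

__global__ void scale_add(float *dest, const float *a, const float *b,
                          float alpha, int n)
{
    int base = (blockIdx.x * BLOCK_SIZE + threadIdx.x) * $unroll;
$body
}
""")


def render_kernel(block_size, unroll):
    """Return CUDA source for one (block_size, unroll) combination."""
    lines = []
    for i in range(unroll):
        # Each thread handles `unroll` consecutive elements, bounds-checked.
        lines.append(
            "    if (base + %d < n) dest[base + %d] = "
            "alpha * a[base + %d] + b[base + %d];" % (i, i, i, i)
        )
    return KERNEL_TEMPLATE.substitute(
        block_size=block_size, unroll=unroll, body="\n".join(lines)
    )


def candidate_kernels(block_sizes=(64, 128, 256), unrolls=(1, 2, 4)):
    """Generate one source string per parameter combination."""
    return {
        (bs, u): render_kernel(bs, u)
        for bs in block_sizes
        for u in unrolls
    }


if __name__ == "__main__":
    # Assumption: with PyCUDA installed and a GPU available, each string
    # could be passed to pycuda.compiler.SourceModule, timed, and the
    # fastest variant kept. That step is left out so this runs anywhere.
    print(render_kernel(128, 2))
```

An automated tester along the lines Matthew describes could then run each
generated variant and compare its output against the NumPy expression
alpha * a + b on the same arrays.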