[Numpy-discussion] Fwd: GPU Numpy
Sat Aug 22 04:50:28 CDT 2009
Erik Tollerud wrote:
>> NumPy arrays on the GPU memory is an easy task. But then I would have to
>> write the computation in OpenCL's dialect of C99?
> This is true to some extent, but also probably difficult to do given
> the fact that parallelizable algorithms are generally more difficult
> to formulate in straightforward ways.
Then you have misunderstood me completely. Creating an ndarray that has
a buffer in graphics memory is not too difficult, given that graphics
memory can be memory mapped. This has nothing to do with parallelizable
algorithms or not. It is just memory management. We could make an
ndarray subclass that quickly puts its content in a buffer accessible to
the GPU. That is not difficult. But then comes the question of what you
do with it.
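To make the memory-management point concrete, here is a sketch of such a subclass. An anonymous mmap stands in for the memory-mapped graphics memory; the class name and structure are my own illustration, not an existing API:

```python
import mmap
import numpy as np

class MappedArray(np.ndarray):
    """ndarray backed by an externally allocated, memory-mapped buffer.

    A GPU-aware version would map device memory here instead; the
    anonymous mmap below is only a stand-in for that mapping.
    """
    def __new__(cls, shape, dtype=np.float64):
        nbytes = int(np.prod(shape)) * np.dtype(dtype).itemsize
        buf = mmap.mmap(-1, nbytes)   # anonymous mapping, stand-in for GPU memory
        obj = super().__new__(cls, shape, dtype=dtype, buffer=buf)
        obj._buf = buf                # keep the mapping alive with the array
        return obj

a = MappedArray((4, 4))
a[:] = 1.0                            # ordinary NumPy semantics on the mapped buffer
```

As the sketch shows, the array machinery is the easy part; nothing here says anything about what computation you then run on that buffer.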
I think many here misunderstand the issue:
Teraflops peak performance of modern GPUs is impressive. But NumPy
cannot easily benefit from that. In fact, there is little or nothing to
gain from optimising at that end. In order for a GPU to help,
computation must be the time-limiting factor. It is not. There is not
more to say about using GPUs in NumPy right now.
Take a look at the timings here: http://www.scipy.org/PerformancePython
It shows that computing with NumPy is more than ten times slower than
using plain C. This is despite NumPy being written in C. The NumPy code
does not incur 10 times more floating point operations than the C code.
The floating point unit does not run in turtle mode when using NumPy.
NumPy's relative slowness compared to C has nothing to do with floating
point computation. It is due to inferior memory use (temporary buffers,
multiple buffer traversals) and memory access being slow. Moving
computation to the GPU can only make this worse.
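The temporary-buffer cost is easy to demonstrate. In the sketch below, the naive expression allocates an intermediate array for `a * x` and traverses memory again for `+ b`; writing through the ufuncs' `out=` argument reuses one output buffer instead (this is an illustration of the memory-use argument, not a claim about the PerformancePython timings themselves):

```python
import numpy as np

a, b = 2.0, 3.0
x = np.arange(1_000_000, dtype=np.float64)

# Naive expression: 'a * x' allocates a temporary array, then '+ b'
# allocates the result and traverses memory a second time.
y1 = a * x + b

# Reusing one buffer via out=: no intermediate temporary is allocated.
y2 = np.empty_like(x)
np.multiply(x, a, out=y2)   # y2 = a*x, written directly into y2
np.add(y2, b, out=y2)       # y2 = a*x + b, same buffer

assert np.array_equal(y1, y2)
```

Both versions do the same floating point work; they differ only in how many buffers are allocated and traversed, which is exactly where the time goes.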
Improved memory usage - e.g. through lazy evaluation and JIT compilation
of expressions - can give up to a tenfold increase in performance. That
is where we must start optimising to get a faster NumPy. Incidentally,
this will also make it easier to leverage modern GPUs.
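A minimal sketch of what lazy evaluation means here, under my own toy design (not an existing NumPy feature): operations build an expression tree instead of executing immediately, and the whole expression is evaluated only on request. A real implementation would then JIT-compile the tree into one fused loop; this sketch still evaluates node by node and only illustrates the deferral:

```python
import numpy as np

class Lazy:
    """Toy deferred expression: record the operation tree, evaluate on demand.

    A JIT-compiling version would fuse the whole tree into a single
    compiled loop over the operands; here we just walk the tree.
    """
    def __init__(self, func, *args):
        self.func, self.args = func, args

    def __add__(self, other):
        return Lazy(np.add, self, other)

    def __mul__(self, other):
        return Lazy(np.multiply, self, other)

    def compute(self):
        vals = [a.compute() if isinstance(a, Lazy) else a for a in self.args]
        return self.func(*vals)

def lazy(arr):
    """Wrap a concrete array as a leaf of the expression tree."""
    return Lazy(lambda v: v, arr)

x = lazy(np.arange(5.0))
expr = x * 2.0 + 1.0       # nothing is computed yet, just a tree
result = expr.compute()     # evaluated here, in one place
```

The point of deferring is that at `compute()` time the whole expression is known, so a smarter backend could emit one traversal with no temporaries - on a CPU or, for that matter, a GPU.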