[Numpy-discussion] Fast threading solution thoughts
Thu Feb 12 14:15:17 CST 2009
Wow, interesting thread. Thanks everyone for the ideas. A few more comments:
* Even though there is a bottleneck between main memory and GPU
memory, as Nathan mentioned, the much larger memory bandwidth on a GPU
often makes GPUs great for memory-bound computations, as long as you
can leave your data on the GPU for most of the computation. In my
case I can do this, and it is something I am pursuing as well.
* I don't really like OpenMP (pragma?!?), but it would be very nice if
Cython had optional support for OpenMP that didn't use comments.
* What I would really like is a nice, super fast *library* built on top
of pthreads that made it possible to do OpenMP-like things in Cython,
but without depending on having an OpenMP compiler. Basically a
fancy, fast thread pool implementation in Cython.
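
A minimal sketch of what an OpenMP-like "parallel for" on top of a thread
pool could look like, written in plain Python rather than Cython. The names
chunk_ranges, parallel_for and scale_in_place are hypothetical, and the
standard library's concurrent.futures pool stands in for a hand-written
pthreads pool. In pure Python the GIL keeps the chunks from running truly
concurrently, so this only illustrates the shape of the interface, not the
speedup:

```python
from concurrent.futures import ThreadPoolExecutor


def chunk_ranges(n, n_chunks):
    """Split range(n) into contiguous (start, stop) pairs of near-equal size."""
    n_chunks = max(1, min(n_chunks, n))
    step, extra = divmod(n, n_chunks)
    ranges = []
    start = 0
    for i in range(n_chunks):
        # The first `extra` chunks take one additional element each.
        stop = start + step + (1 if i < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


def parallel_for(pool, body, n, n_chunks=4):
    """Call body(start, stop) once per chunk of range(n) on the pool.

    This is similar in spirit to a statically scheduled OpenMP loop.
    """
    futures = [pool.submit(body, start, stop)
               for start, stop in chunk_ranges(n, n_chunks)]
    # result() blocks until the chunk is done and re-raises worker errors.
    return [f.result() for f in futures]


def scale_in_place(data, factor):
    """Build a loop body that multiplies data[start:stop] by factor."""
    def body(start, stop):
        for i in range(start, stop):
            data[i] *= factor
    return body


values = [float(i) for i in range(10)]
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel_for(pool, scale_in_place(values, 2.0), len(values))
```

Each chunk touches a disjoint slice of the data, so the loop body needs no
locking. A Cython version would presumably release the GIL inside the body
so the chunks can actually overlap.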
And a question:
With the new NumPy support in Cython, does Cython release the GIL if
it can when running through loops over NumPy arrays? Does
Cython call into the C API during these sections?