[Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?)
Mon Mar 24 09:38:43 CDT 2008
A couple of thoughts on parallelism:
1. Can someone come up with a small set of cases and time them on
numpy, IDL, Matlab, and C, using various parallel schemes, for each of
a representative set of architectures? We're comparing a benchmark to
itself on different architectures, rather than seeing whether the
thread capability is helping our competition on the same architecture.
If it's mostly not helping them, we can forget it for the time being.
I suspect that it is, in fact, helping them, or at least not hurting them.
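To make point 1 concrete, here is a rough sketch of the kind of timing harness I have in mind. The function name time_cases and the two stand-in cases are made up for illustration; real cases would call into numpy (and the equivalent IDL, Matlab, and C codes would need their own drivers). It only shows the shape: build the input outside the timed region, then take the best of several repeats at each size.

```python
import timeit

def time_cases(cases, sizes, repeat=3, number=5):
    """Return {(name, n): best seconds per call} for each case and size."""
    results = {}
    for name, (setup, run) in cases.items():
        for n in sizes:
            data = setup(n)  # build the input outside the timed region
            timer = timeit.Timer(lambda: run(data))
            # Best of several repeats is less sensitive to system noise.
            best = min(timer.repeat(repeat=repeat, number=number)) / number
            results[(name, n)] = best
    return results

# Hypothetical stand-in cases: each maps a name to (setup, run).
cases = {
    "builtin_sum": (lambda n: list(range(n)), sum),
    "sorted": (lambda n: list(range(n, 0, -1)), sorted),
}

results = time_cases(cases, sizes=[1_000, 100_000])
for (name, n), seconds in sorted(results.items()):
    print(f"{name:12s} n={n:>7d}  {seconds:.6f} s")
```

Running the same table of cases on each architecture, with and without threading, would give the side-by-side comparison asked for above.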
2. Would it slow things much to have some state that the routines
check before deciding whether to run a parallel implementation or not?
It could default to single thread except in the cases where
parallelism always helps, but the user can configure it to multithread
beyond certain thresholds of, say, number of elements. Then, in the
short term, a savvy user can tweak that state to get parallelism for
more than N elements. In the longer term, there could be a test
routine that would run on install and configure the state for that
particular machine. When numpy starts, it would read the saved file
and optimize computation for that machine. The user could
always override it.
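A minimal sketch of the state described in point 2, assuming a module-level dictionary and a JSON file for the saved configuration. None of these names (configure, use_parallel, save_state, load_state, dispatch) exist in numpy; they are hypothetical and only illustrate the idea. The check a routine would make is a dictionary lookup and an integer comparison.

```python
import json
import os
import tempfile

# Hypothetical global state; defaults to single-threaded.
_state = {"enabled": False, "min_elements": 100_000}

def configure(enabled=None, min_elements=None):
    """User override of the parallel dispatch state."""
    if enabled is not None:
        _state["enabled"] = bool(enabled)
    if min_elements is not None:
        _state["min_elements"] = int(min_elements)

def use_parallel(n_elements):
    """The check a routine makes before picking an implementation."""
    return _state["enabled"] and n_elements >= _state["min_elements"]

def save_state(path):
    """Write the current state, e.g. at the end of an install-time test."""
    with open(path, "w") as f:
        json.dump(_state, f)

def load_state(path):
    """Read a saved configuration, e.g. at startup."""
    with open(path) as f:
        saved = json.load(f)
    configure(saved.get("enabled"), saved.get("min_elements"))

def dispatch(data, serial, parallel):
    """Run the parallel implementation only above the threshold."""
    impl = parallel if use_parallel(len(data)) else serial
    return impl(data)

# Usage: a savvy user enables threading above 1000 elements and saves it.
configure(enabled=True, min_elements=1000)
path = os.path.join(tempfile.mkdtemp(), "parallel_state.json")
save_state(path)

configure(enabled=False)  # simulate a fresh start with the default
load_state(path)          # the saved file restores the tuned settings
```

An install-time test routine would then only need to measure where the crossover lies on that machine and call save_state with the result.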
3. We should remember the first rule of parallel programming, which
Anne quotes as "premature optimization is the root of all evil".
There is a lot to fix in numpy that is more fundamental than speed. I
am the first to want things fast (I would love my secondary eclipse
analysis to run in less than a week), but we have gaping holes in
documentation and other areas that one would expect to have been
filled before a 1.0 release. I hope we can get them filled for 1.1.
It bears repeating that our main resource shortage is in person-hours,
and we'll get more of those as the community grows. Right now our
deficit in documentation is hurting us badly, while our deficit in
parallelism is not. There is no faster way of growing the community
than making it trivial to learn how to use numpy without hand-holding
from an experienced user. Let's explore parallelism to assess when
and how it might be right to do it, but let's stay focussed on the
fundamentals until we have those nailed.