[Numpy-discussion] parallel compilation of numpy
Wed Feb 18 19:57:27 CST 2009
On Thu, Feb 19, 2009 at 9:14 AM, Ondrej Certik <firstname.lastname@example.org> wrote:
> I have a shiny new computer with 8 cores and numpy still takes forever
> to compile
Forever? It takes one minute to build :) scipy takes forever, but that
is because of C++ more than anything else.
> --- is there a way to compile it in parallel (make -j9)?
> Do distutils allow that?
No, and it never will. Parallel builds require proper dependency
handling. Even make does not handle it well: it works most of the
time by accident, but there are numerous problems (try, for example,
building lapack with make -j8 on your 8-core machine - it will give
a bogus library 90 % of the time, because it starts building the
static library with ar while some object files are still being built).
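
The ordering problem described above can be sketched in a few lines of
Python. This is only an illustration, not how make, scons or numscons
are implemented: compile_object, archive and build_library are
hypothetical stand-ins, and no real compiler or ar is invoked. The
point is that the archive step has to wait for every compile job.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def compile_object(source):
    """Hypothetical stand-in for 'cc -c source'; returns the object name."""
    time.sleep(0.01)  # simulated compile time
    return source.rsplit(".", 1)[0] + ".o"


def archive(objects):
    """Hypothetical stand-in for 'ar': needs every object file to exist."""
    return sorted(objects)


def build_library(sources, jobs=8):
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        # The compilations are independent, so they may run in parallel.
        futures = [pool.submit(compile_object, s) for s in sources]
        # The archive depends on all objects: block on every future first.
        # Skipping this wait is the race that yields a bogus library.
        objects = [f.result() for f in futures]
    return archive(objects)


library = build_library(["dgemm.f", "dgetrf.f", "dpotrf.f"])
print(library)
```

A build tool with real dependency handling encodes that "wait for all
objects" edge explicitly instead of relying on timing.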
scons does handle it well. Now, I don't think it will make building
numpy that much faster. Here are the numbers:
- numscons build, no parallel job: 1m08
- distutils build: 1m17
- numscons build, 4 jobs (on a two cores machine): 54s
Now, those numbers are on Mac OS X, and this is the worst platform to
try it on, because gcc already uses two cores even when using one
command line (I cannot confirm this, but looking at the activity
monitor, gcc always uses both cores when building universal apps, which
There are more fundamental reasons, though:
- for numpy, a lot of time is spent in configuration: configuration
cannot be done with multiple jobs
- numscons has one fundamental limitation: it launches a new scons
subprocess for every subpackage, so you can't parallelize different
subpackages at the same time (say numpy.core and numpy.random).
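
As a rough back-of-the-envelope check on why more cores may not help
much, the two numscons timings quoted earlier can be plugged into
Amdahl's law. This is a sketch under loud assumptions: that Amdahl's
law applies at all, and that 4 jobs on a two-core machine behave like
2 effective workers. The function names are made up for illustration.

```python
def parallel_fraction(t_serial, t_parallel, workers):
    """Invert Amdahl's law, T(n) = T(1) * ((1 - p) + p / n), for p."""
    return (1.0 - t_parallel / t_serial) * workers / (workers - 1)


def predicted_time(t_serial, p, workers):
    """Amdahl's law: the serial part stays, the parallel part is split."""
    return t_serial * ((1.0 - p) + p / workers)


t_serial = 68.0    # numscons build, no parallel job (1m08)
t_parallel = 54.0  # numscons build, 4 jobs on a two-core machine

# Assumption: 4 jobs on 2 cores count as 2 effective workers.
p = parallel_fraction(t_serial, t_parallel, workers=2)
t8 = predicted_time(t_serial, p, workers=8)
print(f"parallel fraction ~ {p:.2f}, predicted 8-worker build ~ {t8:.1f}s")
```

Under those assumptions well under half of the build parallelizes, so
even 8 workers would only shave a modest amount off the one-minute
build. The serial configuration step and the one-subprocess-per-
subpackage design both count toward the serial part.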
This limitation is very hard to circumvent IMHO, and is the main
limitation of numscons ATM:
- it means no-op builds are very slow (20s for scipy, for example),
because I have to relaunch scons for every subpackage, and scons is
relatively slow to start (I have already spent quite some time
optimizing this, but with 30 subpackages in scipy and scons taking
0.2 s just to start, that is already 6 seconds spent doing nothing)
- it means parallel is not as good as it can be
- it means python setup.py sdist is very hard to support (I have
not yet found a way - it is broken ATM if you remove the distutils
scripts - the problem is that)
- it means numscons wastes time repeating the same checks many times
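
The startup cost quoted in the list above is simple arithmetic, spelled
out here for reference. The function name is hypothetical; the 30
subpackages and 0.2 s per scons launch are the figures given above.

```python
def noop_startup_overhead(subpackages, startup_seconds):
    """Lower bound on a no-op build: one scons launch per subpackage."""
    return subpackages * startup_seconds


overhead = noop_startup_overhead(subpackages=30, startup_seconds=0.2)
print(f"scons startup alone: {overhead:.1f}s of the ~20s scipy no-op build")
```

A single scons process covering all subpackages would pay the startup
cost once instead of 30 times.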
If I (or you :) ) solve this, I would be in favor of using numscons.
Yesterday, I wasted half a day with distutils doing something which
takes 2 minutes in numscons. Moving away from distutils would be a big
relief, at least as far as I am concerned.
But it is very hard - I think it would be at least one week of full
work (because it is a scons limitation).