[Numpy-discussion] parallel compilation of numpy

David Cournapeau david@ar.media.kyoto-u.ac...
Wed Feb 18 23:28:35 CST 2009


Michael Abshoff wrote:
> David Cournapeau wrote:
>   
>> Christian Heimes wrote:
>>     
>>> David Cournapeau wrote:
>>>       
>
> Hi,
>
>   
>>> You may call me naive and ignorant. Is it really that hard to achieve
>>> some kind of poor man's concurrency? You don't have to parallelize
>>> everything to get a speedup on multi-core machines. Usually the compile
>>> process from C/C++ file to object file takes up most of the time.
>>>
>>> How about
>>>
>>> * assemble a list of all C/C++ source files of all extensions.
>>> * compile all source files in parallel
>>> * do the rest (linking etc.) in serial
>>>   
>>>       
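
The three steps quoted above could be sketched roughly as follows in
modern Python. This is only an illustration, not distutils code: the
names build_extensions, compile_one and object_path are hypothetical,
the compiler command "cc" is an assumption, and the link step is left
as a caller-supplied function.

```python
from concurrent.futures import ThreadPoolExecutor
import os
import subprocess


def object_path(source, build_dir):
    # Hypothetical naming scheme: build_dir/<basename>.o
    # (two sources with the same basename would collide).
    base = os.path.splitext(os.path.basename(source))[0]
    return os.path.join(build_dir, base + ".o")


def compile_one(source, build_dir, cc="cc"):
    # Compile a single C file to an object file; "cc" is an assumed compiler.
    obj = object_path(source, build_dir)
    subprocess.check_call([cc, "-c", source, "-o", obj])
    return obj


def build_extensions(extensions, build_dir, jobs=4,
                     compile_fn=compile_one, link_fn=None):
    # extensions maps an extension name to its list of source files.
    # Step 1: assemble one de-duplicated list of all source files.
    sources = sorted({src for srcs in extensions.values() for src in srcs})

    # Step 2: compile all sources in parallel. Threads are enough here,
    # because the real work happens in the child compiler processes.
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        objects = dict(zip(
            sources,
            pool.map(lambda src: compile_fn(src, build_dir), sources),
        ))

    # Step 3: do the rest (linking etc.) serially, one extension at a time.
    results = {}
    for name, srcs in extensions.items():
        objs = [objects[src] for src in srcs]
        results[name] = link_fn(name, objs) if link_fn else objs
    return results
```

Note that this only covers a build from scratch. It does no dependency
tracking and nothing to guard against two jobs writing the same output
file, which is where the hard problems of parallel builds usually are.
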
>
> With Sage we do the cythonization in parallel and for now build the
> extensions serially, but we have code to do that in parallel, too. Given
> that we are building 180 extensions or so, the speedup is linear. I often
> do this using 24 cores, and it seems robust: I work on Sage daily, often
> test builds from scratch, and have never had any problems with that
> code.
>   

Note that building from scratch is the easy case, especially for
parallel builds. Also, I would guess "cythonizing" is easy, at least
if it is done entirely in Python. Race conditions in subprocess are a
real problem: they caused numerous issues in scons and waf, so I would be
really surprised if they did not cause any trouble in distutils.
In particular, on Windows, subprocess up to Python 2.4 was problematic, I
believe (I should really check, because I was not involved in the
related discussions or in the fixes in scons).

> To taunt Ondrej: A one minute build isn't forever - numpy is tiny and I 
> understand why it might seem long compared to SymPy, but just wait until 
> you add Cython extensions by default and those build times will go up 
> substantially.
>   

Building the scipy installer on Windows takes 1 hour, which is already
relatively significant. But really, parallel builds are just a nice
consequence of using a sane build tool. I simply cannot stand distutils
anymore; it now feels even more painful than developing on Windows.
Every time you touch something, something else, totally unrelated, breaks.

cheers,

David

