[Numpy-discussion] Improving Python+MPI import performance

Dag Sverre Seljebotn d.s.seljebotn@astro.uio...
Fri Jan 13 14:21:30 CST 2012


On 01/13/2012 09:19 PM, Dag Sverre Seljebotn wrote:
> On 01/13/2012 02:13 AM, Asher Langton wrote:
>> Hi all,
>>
>> (I originally posted this to the BayPIGgies list, where Fernando Perez
>> suggested I send it to the NumPy list as well. My apologies if you're
>> receiving this email twice.)
>>
>> I work on a Python/C++ scientific code that runs as a number of
>> independent Python processes communicating via MPI. Unfortunately, as
>> some of you may have experienced, module importing does not scale well
>> in Python/MPI applications. For 32k processes on BlueGene/P, importing
>> 100 trivial C-extension modules takes 5.5 hours, compared to 35
>> minutes for all other interpreter loading and initialization. We
>> developed a simple pure-Python module (based on knee.py, a
>> hierarchical import example) that cuts the import time from 5.5 hours
>> to 6 minutes.
>>
>> The code is available here:
>>
>> https://github.com/langton/MPI_Import
>>
>> Usage, implementation details, and limitations are described in a
>> docstring at the beginning of the file (just after the mandatory
>> legalese).
>>
>> I've talked with a few people who've faced the same problem and heard
>> about a variety of approaches, which range from putting all necessary
>> files in one directory to hacking the interpreter itself so it
>> distributes the module-loading over MPI. Last summer, I had a student
>> intern try a few of these approaches. It turned out that the problem
>> wasn't so much the simultaneous module loads, but rather the huge
>> number of failed open() calls (ENOENT) as the interpreter tries to
>> find the module files. In the MPI_Import module, we have rank 0
>> perform the module lookups and then broadcast the locations to the
>> rest of the processes. For our real-world scientific applications
>> written in Python and C++, this has meant that we can start a problem
>> and actually make computational progress before the batch allocation
>> ends.
>
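
A minimal sketch of the rank-0 lookup and broadcast described above, not the
MPI_Import implementation itself. It assumes Python 3's importlib and a
communicator object with mpi4py-style Get_rank() and bcast() methods (for
example mpi4py's MPI.COMM_WORLD). The function names are hypothetical.

```python
import importlib.util
import sys


def locate_modules(names):
    """Map each module name to the file path the import system finds, or None."""
    locations = {}
    for name in names:
        spec = importlib.util.find_spec(name)
        # Built-in and frozen modules have no file on disk; record None for them.
        has_file = spec is not None and spec.has_location
        locations[name] = spec.origin if has_file else None
    return locations


def import_from_locations(locations):
    """Load modules directly from known paths, skipping the sys.path search."""
    loaded = {}
    for name, path in locations.items():
        if name in sys.modules or path is None:
            # Already imported, or no file to load: use the normal import.
            loaded[name] = importlib.import_module(name)
            continue
        spec = importlib.util.spec_from_file_location(name, path)
        module = importlib.util.module_from_spec(spec)
        sys.modules[name] = module
        spec.loader.exec_module(module)
        loaded[name] = module
    return loaded


def collective_import(names, comm):
    """Rank 0 searches the filesystem; every rank loads from the broadcast paths."""
    locations = locate_modules(names) if comm.Get_rank() == 0 else None
    locations = comm.bcast(locations, root=0)
    return import_from_locations(locations)
```

With mpi4py, every rank would call collective_import([...], MPI.COMM_WORLD).
Only rank 0 then triggers the failed open() calls during the search; the other
ranks open each file by its known path.
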
> This is great news! I've forwarded it to the mpi4py mailing list, which
> despairs over this regularly.
>
> Another idea: Given your diagnostics, wouldn't dumping the output of
> "find" for every path in sys.path to a single text file work well? Then
> each node downloads that file once and consults it when looking for
> modules, instead of querying file metadata over the network.
>
> (In fact I think "texhash" does the same for LaTeX?)
>
> The disadvantage is that one would need to run "update-python-paths"
> every time a package is installed to update the text file. But I'm not
> sure if that disadvantage is larger than remembering to avoid
> diverging import paths between nodes; hopefully one could put a reminder
> to run update-python-paths in the ImportError string.

I meant "diverging code paths during imports between nodes".
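
A minimal sketch of the file-index idea, assuming a plain text file with one
path per line. The names build_index, load_index and find_module_file are
hypothetical, and the suffix list and search order are simplified compared
with what the interpreter actually tries.

```python
import os


def build_index(search_paths, index_file):
    """Write every file path under the given directories to one text file."""
    with open(index_file, "w") as out:
        for root in search_paths:
            if not os.path.isdir(root):
                continue
            for dirpath, _dirnames, filenames in os.walk(root):
                for filename in filenames:
                    out.write(os.path.join(dirpath, filename) + "\n")


def load_index(index_file):
    """Read the index once into a set for in-memory membership tests."""
    with open(index_file) as f:
        return {line.rstrip("\n") for line in f}


def find_module_file(name, search_paths, index, suffixes=(".py", ".so")):
    """Return the first candidate path found in the index, or None."""
    parts = name.split(".")
    for root in search_paths:
        base = os.path.join(root, *parts)
        # Try a package directory first, then single-file modules.
        candidates = [os.path.join(base, "__init__.py")]
        candidates += [base + suffix for suffix in suffixes]
        for candidate in candidates:
            if candidate in index:
                return candidate
    return None
```

build_index plays the role of "update-python-paths" and would be rerun after
each package install. A lookup then costs set membership tests rather than
failed open() calls on the shared filesystem.
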

Dag

>
>
>> If you try out the code, I'd appreciate any feedback you have:
>> performance results, bugfixes/feature-additions, or alternate
>> approaches to solving this problem. Thanks!
>
> I didn't try it myself, but forwarding this from the mpi4py mailing list:
>
> """
> I'm testing it now and actually
> running into some funny errors with unittest on Python 2.7 causing
> infinite recursion.  If anyone is able to get this going, and could
> report successes back to the group, that would be very helpful.
> """
>
> Dag Sverre
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion


