[Numpy-discussion] Improving Python+MPI import performance
Sturla Molden
sturla@molden...
Fri Jan 13 17:28:39 CST 2012
On 13.01.2012 22:42, Sturla Molden wrote:
> On 13.01.2012 22:24, Robert Kern wrote:
>> Do these systems have a ramdisk capability?
> I assume you have seen this as well :)
>
> http://www.cs.uoregon.edu/Research/paracomp/papers/iccs11/iccs_paper_final.pdf
>
This paper also repeats a common mistake about the GIL:
"A future challenge is the increasing number of CPU cores per node,
which is normally addressed by hybrid thread and message passing based
parallelization. Whereas message passing can be used transparently by
both on Python and C level, the global interpreter lock in CPython
limits the thread based parallelization to the C-extensions only. We are
currently investigating hybrid OpenMP/MPI implementation with the hope
that limiting threading to only C-extension provides enough performance."
This is NOT true.
Python threads are native OS threads. They can be used for parallel
computing on multi-core CPUs. The only requirement is that the Python
code calls a C extension that releases the GIL. We can use threads in C
or Python code: OpenMP and threading.Thread perform equally well, but
with threading.Thread the GIL must be released for the threads to run in
parallel.
OpenMP is typically better for fine-grained parallelism in C code and
threading.Thread is better for coarse-grained parallelism in Python
code. The latter is also where mpi4py and multiprocessing can be used.
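To illustrate the point about threading.Thread, here is a minimal sketch. It assumes hashlib as a stand-in for "a C extension that releases the GIL" (hashlib is documented to release it while hashing buffers larger than 2047 bytes); the function name digest_chunks and the chunk sizes are made up for the example. Any actual speedup depends on the number of cores and the size of the work per thread.

```python
import hashlib
import threading


def digest_chunks(chunks):
    """Hash each chunk in its own threading.Thread; return hex digests in input order."""
    results = [None] * len(chunks)

    def worker(index, data):
        # Assumption for this sketch: hashlib stands in for any C extension
        # that releases the GIL around its heavy work, which is what lets
        # several of these threads run on different cores at the same time.
        results[index] = hashlib.sha256(data).hexdigest()

    threads = [
        threading.Thread(target=worker, args=(i, chunk))
        for i, chunk in enumerate(chunks)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    # Coarse-grained work: a few large, independent buffers (sizes are arbitrary).
    chunks = [bytes([i]) * (8 * 1024 * 1024) for i in range(4)]
    for digest in digest_chunks(chunks):
        print(digest)
```

Each thread writes only to its own slot in the results list, so no lock is needed. The same pattern applies to NumPy calls that release the GIL internally, but which ones do so has to be checked per function.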
Sturla