[Numpy-discussion] Using multiprocessing (shared memory) with numpy array multiplication
Wed Jun 15 12:38:45 CDT 2011
Perhaps it is time to write something in the SciPy cookbook about
parallel computing with NumPy? There seem to be certain problems that are
discussed again and again. These are some issues that come to mind (I'm
sure there are more):
- The difference between I/O bound, memory bound, and CPU bound work.
- Why NumPy code is usually memory bound, and what that means.
- The problem with false-sharing in cache lines (including Python refcounts)
- What the GIL is and what it's not (real information instead of FUD)
- Linear algebra with optimized BLAS and LAPACK libraries.
- Parallel FFTs (FFTW, MKL, ACML)
- Parallel PRNGs (and algorithmic pitfalls)
- Autovectorizing Fortran compilers
- OpenMP with C, C++ or Fortran (and using it from Python)
- Python threads and releasing the GIL
- Python threads in Cython
- Native threads in Cython
- multiprocessing with ordinary NumPy arrays
- multiprocessing with shared memory
- MPI with Python (mpi4py)
- os.fork and copy-on-write memory (including the problem with Python
- Using GPUs with Python, including ACML-GPU, PyOpenCL and PyCUDA.
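
To make the "Python threads and releasing the GIL" item a bit more concrete, here is a minimal sketch of array multiplication split across Python threads. It assumes that np.dot releases the GIL while it works on floating-point data, so threads can run side by side without copying the arrays. The names threaded_matmul and n_threads are made up for illustration and are not part of NumPy.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor


def threaded_matmul(a, b, n_threads=4):
    """Compute a @ b, giving each thread a contiguous block of rows of a.

    Hypothetical helper for illustration only; not a NumPy API.
    """
    a = np.ascontiguousarray(a, dtype=np.float64)
    b = np.ascontiguousarray(b, dtype=np.float64)
    out = np.empty((a.shape[0], b.shape[1]), dtype=np.float64)

    # Row boundaries for each block. The blocks are disjoint, so no two
    # threads ever write to the same rows of the output.
    bounds = np.linspace(0, a.shape[0], n_threads + 1).astype(int)

    def work(i):
        lo, hi = bounds[i], bounds[i + 1]
        if hi > lo:
            # A row slice of a C-contiguous array is itself C-contiguous,
            # which is what np.dot requires of its out= argument.
            np.dot(a[lo:hi], b, out=out[lo:hi])

    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        # list() waits for all blocks and re-raises any worker exception.
        list(pool.map(work, range(n_threads)))
    return out
```

Whether this is faster than a plain np.dot depends on the machine and the BLAS library in use: an optimized BLAS may already be multithreaded, and memory-bound work may not scale with the number of threads at all.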