[IPython-user] Time taken to push. Adjusting MPI parameter for processor affinity. No effect.

Brian Granger ellisonbg.net@gmail....
Wed Nov 5 10:48:04 CST 2008


You are comparing apples (IPython.kernel) and oranges (PyPar+MPI).
PyPar uses MPI, which is a peer-to-peer message-passing architecture.
IPython uses a completely different architecture and doesn't use MPI
for its implementation.  Thus, it _should_ be slower than MPI.  I will
explain a bit more to clarify things...

* IPython's architecture is not peer-to-peer.  Instead, all
processes connect to a central process (the IPython controller) which
manages the computation.  This architecture is required when you want
to do things interactively and be able to disconnect/reconnect to a
running parallel job.  Here is what things look like in IPython:

Client
  |
Controller
  |      \
  |       \
Engine   Engine

All of these connections are handled using regular TCP sockets.
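
To make the cost of the central controller concrete, here is a toy model
(not IPython code) that counts the bytes crossing the network when one
payload is delivered to several engines.  It assumes the controller
receives one copy from the client and forwards one copy per engine; the
function names are hypothetical.

```python
def star_bytes(payload_bytes, n_engines):
    """Bytes on the wire when a client pushes via a central controller.

    Assumption: one copy client -> controller, then one copy per engine.
    """
    client_to_controller = payload_bytes
    controller_to_engines = payload_bytes * n_engines
    return client_to_controller + controller_to_engines


def direct_bytes(payload_bytes, n_engines):
    """Bytes on the wire when the sender talks to each receiver directly."""
    return payload_bytes * n_engines


if __name__ == "__main__":
    size = 500 * 10**6  # 500 MB payload
    for n in (1, 4):
        print(n, star_bytes(size, n), direct_bytes(size, n))
```

Under that assumption the relay always adds one extra copy of the
payload, and every copy passes through the controller's own network
links.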

* We would like to optimize push/pull and other operations in IPython.
 At the same time, if you are sending really large objects from the
client to the engines, the best solution is for you to re-work your
algorithm to avoid this data transfer.  The same is true of using
PyPar+MPI.  Here are some tips:

1.  Build the large matrices on the engines in the first place.  The
only thing you should send to the engines are i) the info required to
build the matrix and ii) the code required to build the matrix.

2.  If the matrices are being read from data on disk, do that on the
engines in parallel rather than in the client.

3.  If you need to send data efficiently between engines during a
computation, use MPI.  IPython fully integrates with MPI (I can give
you more info on how this works if you want).  Also, the best MPI
bindings to use with IPython (by far) are mpi4py (mpi4py.scipy.org).
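
Tip 1 can be sketched without a cluster.  The snippet below is only an
illustration using the standard library: build_matrix and the recipe
dictionary are hypothetical stand-ins for whatever builds your real
matrices, and pickle stands in for whatever serialization the transport
uses.  It compares the size of what would travel over the wire when you
ship the whole matrix with the size when you ship only the recipe.

```python
import pickle
import random


def build_matrix(seed, n_rows, n_cols):
    """Build a matrix of pseudo-random floats from a small recipe.

    Hypothetical helper: in practice this would run on each engine.
    """
    rng = random.Random(seed)
    return [[rng.random() for _ in range(n_cols)] for _ in range(n_rows)]


recipe = {"seed": 42, "n_rows": 200, "n_cols": 200}

# Option A: build on the client and ship the whole matrix.
matrix = build_matrix(**recipe)
full_payload = pickle.dumps(matrix, protocol=pickle.HIGHEST_PROTOCOL)

# Option B: ship only the recipe; the engine calls build_matrix itself.
recipe_payload = pickle.dumps(recipe, protocol=pickle.HIGHEST_PROTOCOL)

# The engine-side result is identical because the seed fixes the sequence.
rebuilt = build_matrix(**pickle.loads(recipe_payload))

if __name__ == "__main__":
    print("matrix payload bytes:", len(full_payload))
    print("recipe payload bytes:", len(recipe_payload))
    print("same matrix:", rebuilt == matrix)
```

The recipe is a few dozen bytes regardless of the matrix dimensions,
while the serialized matrix grows with the number of elements.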

* Think of how long it takes to download a 500 MB file.  500 MB is a
lot of data no matter how you handle it.  It should be slow.  Granted,
IPython currently is not optimized to handle such large objects, but
even if it were, it would still be slow.
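
As a rough back-of-the-envelope check, the ideal transfer time is just
size divided by bandwidth.  The link speeds below are assumptions chosen
for illustration, not measurements from any particular machine, and the
calculation ignores serialization, copying and protocol overhead, so
real times would be longer.

```python
def transfer_seconds(size_bytes, bandwidth_bits_per_s):
    """Ideal time to move size_bytes over a link, ignoring all overhead."""
    return size_bytes * 8 / bandwidth_bits_per_s


SIZE = 500 * 10**6  # 500 MB, using 1 MB = 10**6 bytes

if __name__ == "__main__":
    for label, bandwidth in (("100 Mbit/s", 100e6), ("1 Gbit/s", 1e9)):
        print(label, transfer_seconds(SIZE, bandwidth), "s")
```

That works out to 40 seconds at 100 Mbit/s and 4 seconds at 1 Gbit/s
for a single copy, before any overhead is counted.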

* Can you outline/describe the algorithm you are using?  I am more
than willing to help you figure out a way of optimizing it using



On Sat, Nov 1, 2008 at 3:05 AM, mark starnes <m.starnes05@imperial.ac.uk> wrote:
> Hi everyone,
> A short update.  This multi-processor, shared memory machine shows no performance change
> with the code snippet above, when I set 'processor affinity' to 1.  I got the tip from,
> http://www.open-mpi.org/faq/?category=tuning
> and remain interested in the results anyone else gets / tips for improving push times.
> Best regards,
> Mark.
> _______________________________________________
> IPython-user mailing list
> IPython-user@scipy.org
> http://lists.ipython.scipy.org/mailman/listinfo/ipython-user
