[IPython-User] Issues running IPython parallel in MPI mode

MinRK benjaminrk@gmail....
Fri Jan 27 15:56:14 CST 2012


On Fri, Jan 27, 2012 at 13:24, Matthew Rocklin <mrocklin@gmail.com> wrote:
> Hi everyone,
>
> I've been playing with using ipython parallel in mpi mode. On my laptop it's
> working swell but on a different machine I can't get the ipengines to be
> part of the same MPI world.
>
> I.e. after setting up either an ipcluster or ipcontroller + ipengine I run
> the following in python
>
> from IPython.parallel import Client
> rc = Client() #profile='mpi')
> view = rc[:]
> view.execute('from mpi4py import MPI')
> view.execute('comm = MPI.COMM_WORLD')
> view.execute('rank = comm.Get_rank()')
> print view['rank']

The above can be done a bit more conveniently with apply:

def get_rank():
    from mpi4py import MPI
    return MPI.COMM_WORLD.Get_rank()
view.apply_sync(get_rank)
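
As a rough sketch of how to read that result: apply_sync on a view of all
engines returns one value per engine, so engines that share a single MPI
world should report each rank from 0 to n-1 exactly once.  The helper name
`check_ranks` below is made up for illustration and is not part of IPython:

```python
def check_ranks(ranks):
    """Return True if `ranks` looks like one shared MPI world.

    `ranks` is the list returned by view.apply_sync(get_rank):
    one integer per engine.
    """
    # In a shared world of n processes, the ranks are exactly 0..n-1.
    return sorted(ranks) == list(range(len(ranks)))

print(check_ranks([2, 0, 3, 1]))  # four engines in one world
print(check_ranks([0, 0, 0, 0]))  # each engine alone in its own world
```

With a single engine the check passes trivially, so it is only informative
for two or more engines.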

>
> and get out a list of zeros, each of the engines is inside of its own mpi
> world.
>
> I set up the cluster either by using
> ipcluster start --n 4 --engines=MPI --controller=MPI

Side note: it's rarely useful to use MPI to start the controller - it
will always be alone in its own MPI universe.  Only do this if it's
required by your batch system/sysadmin somehow.

> or
> ipcluster start --n 4 --profile=mpi # config files set up as in the tutorial
>
> or directly using ipcontroller and ipengines
>
> ipcontroller --profile=mpi
> mpiexec --np 4 ipengine --profile=mpi

Even this, using mpiexec *directly*, doesn't work?  I don't see how
that would even be possible, unless your MPI setup is messed up.

Try a simple test script:

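import os
import sys
from mpi4py import MPI
comm = MPI.COMM_WORLD

rank = comm.Get_rank()
# a single write per process, so lines from different ranks don't interleave
sys.stdout.write("pid:%i, rank:%i\n" % (os.getpid(), rank))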

And run this with `mpiexec -np 4 python test_script.py`.
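
For reference, a healthy run should print four lines with distinct pids and
ranks 0 through 3, in some order.  Four lines that all say rank:0 would mean
each process initialized its own world, which points at the MPI/mpi4py setup
rather than at IPython (one common cause is an mpiexec from one MPI install
launching an mpi4py built against another).  Below is a small sketch for
checking captured output; `summarize` is a hypothetical helper and the
sample pids are invented:

```python
import re

# Matches the lines printed by the test script above.
LINE = re.compile(r"pid:(\d+), rank:(\d+)")

def summarize(output):
    """Map each reported rank to the list of pids that reported it."""
    by_rank = {}
    for pid, rank in LINE.findall(output):
        by_rank.setdefault(int(rank), []).append(int(pid))
    return by_rank

# Invented sample outputs, for illustration only.
good = "pid:101, rank:1\npid:100, rank:0\npid:103, rank:3\npid:102, rank:2\n"
bad = "pid:200, rank:0\npid:201, rank:0\npid:202, rank:0\npid:203, rank:0\n"

print(summarize(good))  # one pid under each of four ranks
print(summarize(bad))   # every pid listed under rank 0
```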

>
> I'm using ubuntu with the ipython that is packaged in the enthought
> distribution and linking to openmpi.
>
> Any thoughts on common mistakes I could be making?

You are doing the right thing by going straight to mpiexec.  Whenever
ipcluster doesn't work, the first step for debugging is to try to do
explicitly what you think ipcluster should be doing, and make sure
that works.  ipcluster is convenient when it works, but it's a hassle
to debug.

The only thing I can think of is that you might still be running an old
controller/engines and are not actually connecting to the new ones
started with MPI.  This seems unlikely, but you can check with
`ps aux | grep ipengine`.
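
The same check can be sketched in Python, assuming a Unix-like system where
`ps aux` is available.  Engines left over from an earlier non-MPI start
would each report rank 0, which matches the symptom.  Note also that the
quoted snippet calls Client() with profile='mpi' commented out, which as far
as I know connects to the default profile rather than the mpi one.  The
function names and the sample `ps` lines here are invented for illustration:

```python
import subprocess

def find_ipython_procs(ps_output):
    """Return the `ps aux` lines that mention ipengine or ipcontroller."""
    names = ("ipengine", "ipcontroller")
    return [line for line in ps_output.splitlines()
            if any(name in line for name in names)]

def running_ipython_procs():
    """Run `ps aux` (Unix-like systems only) and filter its output."""
    out = subprocess.check_output(["ps", "aux"])
    return find_ipython_procs(out.decode("utf-8", "replace"))

# Invented `ps aux` excerpt, for illustration only.
sample = (
    "matt  4242  0.0  0.5  python /usr/bin/ipengine --profile=default\n"
    "matt  4250  0.0  0.1  bash\n"
)
print(find_ipython_procs(sample))
```

Calling running_ipython_procs() on the affected machine lists whatever is
actually running; any engine whose command line shows a profile other than
the one you expect is a candidate for the stale-process explanation.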

-MinRK

>
> -Matt
>

