[IPython-User] Issues running IPython parallel in MPI mode

Matthew Rocklin mrocklin@gmail....
Fri Jan 27 16:15:18 CST 2012


The issue is resolved now, thanks for the pointer.

Here is what happened:
Your test script didn't work either, which showed that the problem was in
the MPI setup and not in IPython.

I then tried a standard MPI hello world application and that did work.

That pointed to the layer between MPI and IPython: mpi4py.
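
A minimal sketch of how one might inspect that layer, assuming mpi4py is importable at all. The helper name describe_mpi4py_build is made up for illustration; mpi4py.get_config() and MPI.get_vendor() are existing mpi4py calls. It only reports what mpi4py says about its build and its MPI backend, so a mismatch still has to be spotted by eye.

```python
def describe_mpi4py_build():
    """Collect what mpi4py reports about its own build and MPI backend.

    Returns None if mpi4py (or the MPI library it needs) cannot be loaded.
    The function name is hypothetical, chosen for this sketch.
    """
    try:
        import mpi4py
        from mpi4py import MPI  # importing this module initializes MPI
    except (ImportError, RuntimeError):
        return None
    return {
        "mpi4py_version": mpi4py.__version__,
        # Build-time settings recorded by mpi4py (e.g. the compiler wrapper).
        "build_config": mpi4py.get_config(),
        # (name, version tuple) of the MPI implementation in use.
        "vendor": MPI.get_vendor(),
    }


if __name__ == "__main__":
    print(describe_mpi4py_build())
```

If the compiler wrapper listed in the build config belongs to an MPI installation other than the one mpiexec comes from, a stale build is a plausible suspect.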

When I first installed mpi4py, my system had an MPI installation that was
not fully working. I realized this a couple of days ago and installed a
fresh openmpi, which cleared up many issues. Unfortunately, parts of mpi4py
had been compiled against the old MPI, and that caused the confusion I
brought to the list. A quick "pip install mpi4py --upgrade" forced
recompilation against the newer openmpi installation. Everything is now
fine. Seeing the list [3, 2, 0, 1] has never brought me so much joy.
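
The difference between the broken output (a list of zeros) and the working one can be checked mechanically. The sketch below is only an illustration: the helper name shares_one_mpi_world is hypothetical, and it assumes the ranks have already been collected from the engines, for example via view['rank'].

```python
def shares_one_mpi_world(ranks):
    """Return True if the ranks look like one shared MPI.COMM_WORLD.

    With n engines in a single world, the ranks are some ordering of
    0..n-1. If every engine started its own world, each reports rank 0.
    """
    return sorted(ranks) == list(range(len(ranks)))


print(shares_one_mpi_world([3, 2, 0, 1]))  # True: four engines, one world
print(shares_one_mpi_world([0, 0, 0, 0]))  # False: four separate worlds
```

A single engine reporting [0] passes this check trivially, so it is only informative with two or more engines.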

Thanks,
-Matt

On Fri, Jan 27, 2012 at 3:56 PM, MinRK <benjaminrk@gmail.com> wrote:

> On Fri, Jan 27, 2012 at 13:24, Matthew Rocklin <mrocklin@gmail.com> wrote:
> > Hi everyone,
> >
> > I've been playing with using ipython parallel in mpi mode. On my laptop
> > it's working swell but on a different machine I can't get the ipengines
> > to be part of the same MPI world.
> >
> > I.e. after setting up either an ipcluster or ipcontroller + ipengine I
> > run the following in python
> >
> > from IPython.parallel import Client
> > rc = Client() #profile='mpi')
> > view = rc[:]
> > view.execute('from mpi4py import MPI')
> > view.execute('comm = MPI.COMM_WORLD')
> > view.execute('rank = comm.Get_rank()')
> > print view['rank']
>
> The above can be done a bit more conveniently with apply:
>
> def get_rank():
>    from mpi4py import MPI
>    return MPI.COMM_WORLD.Get_rank()
> view.apply_sync(get_rank)
>
> >
> > and get out a list of zeros: each of the engines is inside its own MPI
> > world.
> >
> > I set up the cluster either by using
> > ipcluster start --n 4 --engines=MPI --controller=MPI
>
> Side note: it's rarely useful to use MPI to start the controller - it
> will always be alone in its own MPI universe.  Only do this if it's
> required by your batch system/sysadmin somehow.
>
> > or
> > ipcluster start --n 4 --profile=mpi # config files set up as in the tutorial
> >
> > or directly using ipcontroller and ipengines
> >
> > ipcontroller --profile=mpi
> > mpiexec --np 4 ipengine --profile=mpi
>
> Even this, using mpiexec *directly* doesn't work?  I don't see how
> that would even be possible, unless your MPI setup is messed up.
>
> Try a simple test script:
>
> import os
> from mpi4py import MPI
> comm = MPI.COMM_WORLD
>
> rank = comm.Get_rank()
> print "pid:%i, rank:%i\n" % (os.getpid(),rank),
>
> And run this with `mpiexec -np 4 python test_script.py`.
>
> >
> > I'm using ubuntu with the ipython that is packaged in the enthought
> > distribution and linking to openmpi.
> >
> > Any thoughts on common mistakes I could be making?
>
> You are doing the right thing by going straight to mpiexec.  Whenever
> ipcluster doesn't work, the first step for debugging is to try to do
> explicitly what you think ipcluster should be doing, and make sure
> that works.  ipcluster is convenient when it works, but it's a hassle
> to debug.
>
> The only thing I can think of is that you might still be running an old
> controller/engines and are not actually connecting to your new ones
> started with MPI.  This seems unlikely, but you can check with
> `ps aux | grep ipengine`.
>
> -MinRK
>
> >
> > -Matt
> >
> > _______________________________________________
> > IPython-User mailing list
> > IPython-User@scipy.org
> > http://mail.scipy.org/mailman/listinfo/ipython-user
> >
>