[IPython-User] Issues running IPython parallel in MPI mode
Fri Jan 27 16:15:18 CST 2012
The issue is resolved now, thanks for the pointer.
The story of what was up:
Your example didn't work, which showed that the issue was with MPI and not
IPython. I then tried a standard MPI hello world application and that did work.
This pointed to what sits between MPI and IPython: mpi4py.
When I first installed mpi4py my system had a not-perfectly-functioning
MPI. I realized this was an issue a couple days ago and installed a fresh
openmpi. This cleared up many issues. Bits of mpi4py unfortunately had been
compiled with the old MPI and this was causing the confusion that I brought
to the list. A quick "pip install mpi4py --upgrade" forced recompilation
with the newer openmpi installation. Everything is now fine. Seeing the
list [3, 2, 0, 1] has never brought me so much joy.
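For anyone hitting the same symptom, here is a minimal sketch of how the rank list returned by the engines can be read. The helper name check_ranks is hypothetical (it is not part of IPython or mpi4py), and the labels it returns are just my own shorthand.

```python
def check_ranks(ranks):
    """Classify the list of MPI ranks reported by the engines.

    Hypothetical helper for illustration only.
    """
    n = len(ranks)
    if n == 0:
        return "no engines"
    # Engines in one shared MPI world report each rank 0..n-1 exactly once.
    if sorted(ranks) == list(range(n)):
        return "shared world"
    # Several engines all reporting rank 0 means each one started
    # in its own single-process MPI world.
    if n > 1 and all(r == 0 for r in ranks):
        return "separate worlds"
    return "unexpected"
```

With the view from the quoted session below, this would be called as check_ranks(view['rank']).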
On Fri, Jan 27, 2012 at 3:56 PM, MinRK <firstname.lastname@example.org> wrote:
> On Fri, Jan 27, 2012 at 13:24, Matthew Rocklin <email@example.com> wrote:
> > Hi everyone,
> > I've been playing with using ipython parallel in mpi mode. On my laptop
> > it's working swell but on a different machine I can't get the ipengines to be
> > part of the same MPI world.
> > I.e. after setting up either an ipcluster or ipcontroller + ipengine I
> > run the following in python
> > from IPython.parallel import Client
> > rc = Client() #profile='mpi')
> > view = rc[:]
> > view.execute('from mpi4py import MPI')
> > view.execute('comm = MPI.COMM_WORLD')
> > view.execute('rank = comm.Get_rank()')
> > print view['rank']
> The above can be done a bit more conveniently with apply:
> def get_rank():
>     from mpi4py import MPI
>     return MPI.COMM_WORLD.Get_rank()
> > and get out a list of zeros; each of the engines is inside of its own MPI
> > world.
> > I set up the cluster either by using
> > ipcluster start --n 4 --engines=MPI --controller=MPI
> Side note: it's rarely useful to use MPI to start the controller - it
> will always be alone in its own MPI universe. Only do this if it's
> required by your batch system/sysadmin somehow.
> > or
> > ipcluster start --n 4 --profile=mpi # config files set up as in the
> > or directly using ipcontroller and ipengines
> > ipcontroller --profile=mpi
> > mpiexec --np 4 ipengine --profile=mpi
> Even this, using mpiexec *directly* doesn't work? I don't see how
> that would even be possible, unless your MPI setup is messed up.
> Try a simple test script:
> import os
> from mpi4py import MPI
> comm = MPI.COMM_WORLD
> rank = comm.Get_rank()
> print "pid:%i, rank:%i\n" % (os.getpid(),rank),
> And run this with `mpiexec -np 4 python test_script.py`.
> > I'm using ubuntu with the ipython that is packaged in the enthought
> > distribution and linking to openmpi.
> > Any thoughts on common mistakes I could be making?
> You are doing the right thing by going straight to mpiexec. Whenever
> ipcluster doesn't work, the first step for debugging is to try to do
> explicitly what you think ipcluster should be doing, and make sure
> that works. ipcluster is convenient when it works, but it's a hassle
> to debug.
> The only thing I can think of is that you might still be running an old
> controller/engines and are not actually connecting to your new ones
> started with MPI. This seems unlikely, but you can check with
> `ps aux | grep ipengine`.
> > -Matt