[IPython-User] Load balancing IPython.parallel using MPI?
Jon Olav Vik
Wed Aug 29 09:07:15 CDT 2012
I am using IPython.parallel for load-balanced, fault-tolerant computation on a
shared cluster running a PBS batch system. However, when I scale into hundreds
of engines it seems that the number of TCP connections becomes a limiting
factor, to the extent of slowing down the job submission system and bothering
other users. System admins tell me the cluster is really intended for MPI-based
jobs. I am not transferring large messages, only workpiece indexes, but it
seems the sheer number of messages is the problem.
Can IPython.parallel use MPI instead of TCP for its communication between
ipcontroller and ipengine?
Previously, I ran into a "too many open files" limitation on another cluster:
On this one, the ulimit is higher, so I'm not hitting a hard limit, but instead
I seem to be competing with the batch system's monitoring of nodes.
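To illustrate why the connection count grows so fast: each engine keeps several ZMQ sockets open to the controller, so the controller's file-descriptor usage scales linearly with the number of engines. A back-of-envelope sketch (the per-engine socket count here is my guess, not a documented figure):

```python
# Rough estimate of controller-side file descriptors.
# The ~6 sockets per engine is an assumption (registration, heartbeat,
# shell, iopub, etc.), not a number taken from the IPython docs.
def controller_fds(n_engines, sockets_per_engine=6):
    """Rough count of TCP connections the controller holds open."""
    return n_engines * sockets_per_engine

if __name__ == "__main__":
    # A few hundred engines already exceed a default ulimit of 1024:
    print(controller_fds(300))
```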
Googling for IPython and MPI, what I've found mostly seems to be about
*explicitly* using MPI for message passing, e.g.:
What I would like instead is to take my existing program and switch the
transport of ZMQ messages over to MPI, simply by changing the configuration of
IPython. Is that possible?
An example program is given below -- ideally, I would like to get away with at
most one mention of "MPI" in the code 8-)
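(For reference, the closest I have found so far is that ipcluster can at least *launch* engines through MPI, though as far as I can tell the message transport itself stays ZMQ over TCP. A sketch of the profile configuration, assuming the launcher class names from IPython 0.13 -- please correct me if the transport can be changed too:)

```python
# ipcluster_config.py in the profile directory -- a sketch, assuming
# IPython 0.13 launcher names; this only affects how engines are started,
# the controller<->engine transport itself remains ZMQ/TCP.
c = get_config()
c.IPClusterEngines.engine_launcher_class = 'MPIEngineSetLauncher'
c.MPIEngineSetLauncher.mpi_cmd = ['mpiexec']
```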
Thanks in advance,
"""Distributed, load-balanced computation with IPython.parallel."""
from IPython.parallel import Client
c = Client()
lv = c.load_balanced_view()
def busywait(workpiece_id, seconds):
"""Task to be run on engines. Return i and process id."""
# Imports that will be executed on the client.
t0 = time.time()
while (time.time() - t0) < seconds:
return workpiece_id, seconds, os.getpid()
workpiece_id = range(15)
seconds = [2 + i % 5 for i in workpiece_id] # percent sign means 'modulus'
async = busywait.map(workpiece_id, seconds)