[IPython-User] Load balancing IPython.parallel using MPI?

Jon Olav Vik jonovik@gmail....
Wed Aug 29 19:15:55 CDT 2012


MinRK <benjaminrk <at> gmail.com> writes:

> On Wed, Aug 29, 2012 at 7:07 AM, Jon Olav Vik <jonovik <at> gmail.com> wrote:
> 
> I am using IPython.parallel for load-balanced, fault-tolerant computation on a
> shared cluster running a PBS batch system. However, when I scale into hundreds
> of engines it seems that the number of TCP connections becomes a limiting
> factor, to the extent of slowing down the job submission system and bothering
> other users. System admins tell me the cluster is really intended for
> MPI-based jobs. I am not transferring large messages, only workpiece
> indexes, but it
> seems the sheer number of messages is the problem.
> 
> How many engines start to cause trouble?  I don't have access to hardware to
> test this sort of thing, and would love data points.  I have found that it
> behaves fairly well at least up to 128 engines, which is the most I have run.

About 400 have been running fine, at least when I am nearly alone on the 
cluster. (It is a fairly new one, and being tested.) However, the cluster 
sysadmins have complained that the PBS queueing system seems to be slow when 
I'm doing this kind of stuff, hypothesizing that competition for TCP traffic 
may be the reason. In that case, problems might arise with fewer than 400 
engines if many other jobs are being submitted, or running, or heavily using 
TCP for their own ends.

When things start to clog up, I have seen a lag of several minutes to return 
even small answers from engines. I've tried tweaking the chunksize and hwm 
settings, but just keeping the connections open seems to be causing trouble.

(It would be nice if the high water mark could be changed on a running 
ipcluster -- is that possible?)
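(For completeness, the only way I know of to set it is at controller startup, 
via the cluster profile's config file -- a sketch, assuming the usual 
ipcontroller_config.py, and treating "startup-only" as my understanding rather 
than established fact:)

```python
# ipcontroller_config.py in the cluster profile.
# My understanding is that this is read once when the controller
# starts, not on a running ipcluster.
c = get_config()

# High water mark: cap on unfinished tasks buffered per engine.
# 0 means unlimited (greedy assignment); 1 keeps tasks in the
# scheduler until an engine is free, which load-balances better.
c.TaskScheduler.hwm = 1
```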

> Can IPython.parallel use MPI instead of TCP for its communication between
> ipcontroller and ipengine?
> 
> No, it cannot.  The answer to this will be to shard schedulers across
> multiple machines, which is not yet implemented.  The Client and Engine
> already would have no problem; it's only starting the cloned Schedulers
> that needs to be added.
> 
> The best you can do right now is to limit your IPython jobs, such that 
> you have one IPython cluster per N engines, where N is a reasonable 
> number that you find performs adequately.  Then you would, in your 
> client code, have to distribute work chunks across your P/N clients.

Yes, thank you; I remember discussing this in a previous thread. Using that 
approach, I have had 800 engines working nicely on a different cluster. That 
avoided the user limit on open files, but I would like to know whether it also 
avoids congestion with respect to other users and the batch queue system. In 
addition to spreading traffic over several TCP addresses, I could spread TCP 
traffic over different physical machines by running ipclusters on different 
machines. Do you think that is a relevant option?
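For concreteness, the client-side sharding I have in mind looks roughly like 
the sketch below. The profile names and the chunking helper are my own 
illustration, not anything from IPython itself:

```python
# Sketch: split work into chunks and assign them round-robin across
# several independent IPython.parallel clusters (one Client per
# sub-cluster profile), so no single controller carries all the TCP
# connections.

def round_robin_chunks(tasks, n_clients, chunksize):
    """Split tasks into chunks and deal them out round-robin.

    Returns a list with one entry per client; each entry is a list
    of chunks (each chunk a list of tasks) for that client.
    """
    chunks = [tasks[i:i + chunksize]
              for i in range(0, len(tasks), chunksize)]
    assignment = [[] for _ in range(n_clients)]
    for i, chunk in enumerate(chunks):
        assignment[i % n_clients].append(chunk)
    return assignment

# Against real clusters, usage would look roughly like this
# (untested; "sub%d" profile names are hypothetical):
#
#   from IPython.parallel import Client
#   clients = [Client(profile="sub%d" % i) for i in range(4)]
#   views = [c.load_balanced_view() for c in clients]
#   work = list(range(10000))
#   per_client = round_robin_chunks(work, len(views), 100)
#   results = [view.map_async(process, chunk)
#              for view, chunks in zip(views, per_client)
#              for chunk in chunks]
```

Each sub-cluster still load-balances internally; the round-robin only spreads 
the coarse chunks, so imbalance between clusters stays bounded by one chunk.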

The sysadmins on this particular cluster are only moderately enthusiastic about 
what I'm doing, because that cluster is equipped with an expensive, fast 
interconnect that I do not use. My computations are trivially parallel, though 
I very much need load balancing -- and fault tolerance, so that when the queue 
is crowded, short worker jobs have a better chance of fitting into "backfill" 
gaps.

Myself, I think IPython.parallel could be extremely important in lowering the 
barriers to cluster computing for non-specialists -- if only I found a way to 
deal with this bottleneck.

Thanks for the feedback!

Jon Olav


