[IPython-dev] Using IPython Cluster with SGE -- help needed

Abraham D. Flaxman abie@uw....
Tue Sep 10 19:15:23 CDT 2013


Have you had more success with this, Andreas?  I tried to use my SGE cluster today as well, with moderate success.  I started where you were:

    ipcluster start --profile=sge -n 12  # starts one engine, sometimes

which I can confirm via:
In [48]: len(p.Client(profile='sge'))
Out[48]: 1

Then I qsub more engines with:

    for i in {1..100}; do qsub sge.engine.template; done  # starts 100 more

and this gives me more engines:

In [49]: len(p.Client(profile='sge'))
Out[49]: 101
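
Since the engines register one by one after each qsub, it can help to poll the engine count rather than check it once. The following is only a minimal sketch: `wait_for_engines` and its parameters are hypothetical names of mine, not part of IPython, and the counting function is passed in so the helper does not depend on any particular client API.

```python
import time


def wait_for_engines(count_engines, expected, timeout=60.0, poll=1.0,
                     sleep=time.sleep, clock=time.time):
    """Poll count_engines() until it reports at least `expected` engines.

    count_engines is any zero-argument callable returning the current
    number of registered engines. Returns the last count seen, which may
    be lower than `expected` if the timeout expired first.
    """
    deadline = clock() + timeout
    seen = count_engines()
    # Keep polling until enough engines are registered or time runs out.
    while seen < expected and clock() < deadline:
        sleep(poll)
        seen = count_engines()
    return seen
```

In the session above, `count_engines` could be `lambda: len(p.Client(profile='sge'))` and `expected` could be 101.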

To shut down the cluster:
    qstat | awk 'NR > 2 {print $1}' | xargs qdel  # skip qstat's two header lines
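
The one-liner above hands every job ID that qstat lists to qdel, so it also removes jobs that are not engines. A more selective variant might filter on the job name first. This is a rough sketch under two assumptions: that qstat prints its default column layout (job ID first, job name third), and that the engine jobs run under a name starting with some known prefix. `engine_job_ids` and the prefix `ipengine` used in the note below are hypothetical, and the actual name depends on what the template sets.

```python
def engine_job_ids(qstat_output, name_prefix):
    """Return job IDs from plain `qstat` output whose job name matches.

    Assumes the default SGE column layout, where the job ID is the first
    column and the (possibly truncated) job name is the third.
    """
    job_ids = []
    for line in qstat_output.splitlines():
        fields = line.split()
        # Header and separator lines do not start with a numeric job ID.
        if len(fields) < 3 or not fields[0].isdigit():
            continue
        if fields[2].startswith(name_prefix):
            job_ids.append(fields[0])
    return job_ids
```

The returned IDs, for example from `engine_job_ids(output, "ipengine")`, could then be passed to qdel in place of the full list.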

The cluster also shut down on its own once, which I appreciated, though that may not have been intended.

Any tips on how I can make this work more smoothly?

--Abie


Abraham D. Flaxman
Assistant Professor
Institute for Health Metrics and Evaluation | University of Washington
2301 5th Avenue, Suite 600 | Seattle, WA 98121 | USA
Tel: +1-206-897-2800 | Fax: +1-206-897-2899 UW
abie@uw.edu | http://healthmetricsandevaluation.org | http://healthyalgorithms.com



-----Original Message-----
From: ipython-dev-bounces@scipy.org [mailto:ipython-dev-bounces@scipy.org] On Behalf Of Andreas Hilboll
Sent: Monday, August 05, 2013 7:19 AM
To: Matthieu Brucher
Cc: IPython developers list
Subject: Re: [IPython-dev] Using IPython Cluster with SGE -- help needed

Thanks, Matthieu,

answers inline:

On 05.08.2013 16:02, Matthieu Brucher wrote:
> Hi,
> 
> I don't know why the registration was not complete. Is your home 
> folder the same on all nodes and on the login node?

Yes, it is. Could this be some firewall issue?

> You won't see 12 jobs. You asked for 12 engines, and they will all be 
> submitted in one job and the 12 engines will be started by mpiexec -n 
> 12. This is the standard way of using batch schedulers. Ask for some 
> cores, run an mpi application on these cores.

Well, then I guess our IT department doesn't like "the standard way". We have a multi-node cluster of 12 nodes: one 'management' node and 11 'computing' nodes. We also don't usually have or use MPI.

To use our multi-node cluster the way our sysadmins want us to, I'd need to submit a total of {n} ipengines via {n} calls to ``qsub``.

Any idea how I can accomplish this?
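
One way to get {n} engines from {n} separate ``qsub`` calls is simply to loop over the submission. The sketch below is illustrative only: `submit_engines` is a hypothetical helper, the default file name `sge.engine.template` is the one mentioned in the reply at the top of this thread, and the `run` argument exists so the loop can be tried without a scheduler.

```python
import subprocess


def submit_engines(n, template="sge.engine.template",
                   run=subprocess.check_call):
    """Call `qsub <template>` n times, one job per engine.

    Returns the list of command lines that were run.
    """
    commands = [["qsub", template] for _ in range(n)]
    for cmd in commands:
        # Each call submits one independent job to the scheduler.
        run(cmd)
    return commands
```

Calling `submit_engines(11)` would then issue eleven independent submissions, each of which the scheduler can place on any free node.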

Thanks for your help!
Andreas.


