[IPython-User] Limitations (?) of ipython SGE support
Mon Jan 17 23:33:47 CST 2011
> Thanks for your reply. It seems that interactive use is your main goal.
> However, wouldn't it be better to submit every task separately using qsub
> (instead of submitting ipengines)?
There are a couple of reasons for doing it this way (a few long-running
engine jobs rather than one qsub job per task):
* Our scheduler has much lower latency and overhead than that of SGE.
* The ipython engines have persistent namespaces. Each task can
read from and write to that namespace, and subsequent tasks will see
those changes. This makes a huge difference if you need to do some
lengthy initialization before running the tasks; keeping things in
memory is a huge benefit (see the sketch below).
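To make that concrete, here is a minimal sketch of the pattern, using
the IPython.parallel API that is landing in 0.11 (the 0.10 kernel API
is similar in spirit). Note that load_data and process are hypothetical
stand-ins for your own code and would have to exist on the engines:

    from IPython.parallel import Client

    rc = Client()                  # connect to the running controller
    dview = rc[:]                  # a direct view on every engine

    # Lengthy initialization, done once per engine; 'data' then lives
    # in each engine's persistent namespace for as long as it runs.
    # (load_data is a stand-in for your own initialization code.)
    dview.execute("data = load_data('/path/to/input')", block=True)

    # Later tasks resolve 'data' (and 'process') in the engine's own
    # namespace, so nothing is reloaded between tasks. The
    # load-balanced view hands tasks to whichever engine is free.
    lview = rc.load_balanced_view()
    amr = lview.map(lambda i: process(data, i), range(100))
    results = amr.get()            # wait for and collect the results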
For really long-running tasks (long enough to run into the queue time
limits), using ipython doesn't make a lot of sense. But this is a use
case we should be able to cover better; I will have to think about
that at some point.
> On Jan 17, 2011 2:59 AM, "Brian Granger" <email@example.com> wrote:
>> On Thu, Jan 13, 2011 at 7:41 AM, Chris Filo Gorgolewski
>> <firstname.lastname@example.org> wrote:
>>> I have recently played with ipython on our SGE cluster. I was
>>> surprised to discover that ipython does not use qsub to submit every
>>> job, but submits a prespecified number of ipengines as jobs. Those, I
>>> presume, run indefinitely and accept ipython tasks. This setup seems to
>>> have two major drawbacks:
>> Yes, this is a correct description of what happens.
>>> 1) my cluster has nodes with different max job times. Depending on what
>>> you specify in the qsub options, the job gets sent to a different node.
>>> The limit is 48h. This means that after 48h (assuming that I use a
>>> custom submit script with this option) all of my engines will be
>>> killed and ipython will stop receiving jobs?
>> Yes, that is right.
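>> For concreteness, a custom submit script of the kind you mention might
>> look something like this (just a sketch; h_rt is a common SGE resource
>> name for the wall-clock limit, but the exact directives are
>> site-specific, so check your cluster's docs):
>>
>>     #!/bin/sh
>>     # Export the environment and run in the current directory.
>>     #$ -V
>>     #$ -cwd
>>     # Hard wall-clock limit; this is the 48h ceiling you describe.
>>     #$ -l h_rt=48:00:00
>>     ipengine
>>
>> When h_rt expires, SGE kills the job and the engine dies with it.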
>>> In other words I cannot
>>> run a set of jobs that would run longer than two days using ipython?
>> Yep, there is no way of getting around the limitations/constraints of
>> the queues.
>>> Additionally, if I specify a max job time of 48h, I will most likely
>>> wait longer for the appropriate nodes to become free, which is not
>>> really necessary when my atomic jobs run much faster.
>> Yep, such is life on shared clusters with batch systems :(
>> What about just firing up an EC2 cluster using StarCluster?
>>> 2) I need to specify how many engines I want to use. Assuming I want
>>> my set of jobs to be done as quickly as possible, I should specify a
>>> number bigger than the number of available nodes. This means that in
>>> many situations I will spawn way too many ipengines that will just
>>> sit there doing nothing. This solution seems to lack flexibility.
>>> Or maybe I am using ipython/SGE in the wrong way?
>> From what we have said, I think you are using ipython/sge in the right
>> manner; you are just running into the fact that batch systems are not
>> set up for truly interactive usage.
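>> One practical note: the client can at least tell you how many of the
>> engines you asked for have actually registered, so you can see what
>> made it out of the queue before submitting tasks. Roughly (0.11-style
>> API):
>>
>>     from IPython.parallel import Client
>>     rc = Client()
>>     print len(rc.ids)    # ids of the engines registered so far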
>>> Best regards,
Brian E. Granger, Ph.D.
Assistant Professor of Physics
Cal Poly State University, San Luis Obispo