[IPython-User] Limitations (?) of ipython SGE support

Brian Granger ellisonbg@gmail.com
Mon Jan 17 23:33:47 CST 2011


> Thanks for your reply. It seems that interactive use is your main goal.
> However, wouldn't it be better to submit every task separately using qsub
> (instead of submitting ipengines)?

There are a couple of reasons for doing it this way (one long-running job
per engine rather than one qsub per task):

* Our scheduler has much lower latency and overhead than SGE's.
* The IPython engines have persistent namespaces. Each task can read from
and write to that namespace, and subsequent tasks will see those changes.
This is a huge difference if you need to do some lengthy initialization
before running the tasks; keeping things in memory is a huge benefit.
(See the sketch below.)
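
To make the second point concrete, here is a rough sketch using the
0.11-style IPython.parallel client API (the names and numbers are just
illustrative, not from a real session):

    from IPython.parallel import Client

    rc = Client()        # connect to the running controller
    dview = rc[:]        # direct view on all of the engines
    dview.block = True   # run calls synchronously for this example

    # Lengthy one-time initialization: each engine loads the data into
    # its own persistent namespace.
    dview.execute("import numpy as np")
    dview.execute("table = np.random.rand(10**7)")  # stand-in for real setup

    # Later tasks reuse 'table' without reloading it.
    dview.execute("partial = table.sum()")
    print(sum(dview.pull("partial")))   # gather the per-engine results

Submitting each of those steps as a separate qsub job would mean paying
for the import and the data load every single time.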

But for really long-running tasks (long enough to run into the queue time
limits), using IPython doesn't make a lot of sense. That is a use case we
should be able to cover better; I will have to think about it at some point.

Cheers,

Brian

> Thanks,
> Chris
>
> On Jan 17, 2011 2:59 AM, "Brian Granger" <ellisonbg@gmail.com> wrote:
>> Hi,
>>
>> On Thu, Jan 13, 2011 at 7:41 AM, Chris Filo Gorgolewski
>> <chris.gorgolewski@gmail.com> wrote:
>>> Hi,
>>> I have recently played with ipython on our SGE cluster. I was
>>> surprised to discover that ipython does not use qsub to submit every
>>> job, but submits a prespecified number of ipengines as jobs. Those, I
>>> presume, run indefinitely and accept IPython tasks. This setup seems to
>>> have two major drawbacks:
>>
>> Yes, this is a correct description of what happens.
>>
>>> 1) my cluster has nodes with different maximum job times. Depending on
>>> what you specify in the qsub options, the job gets sent to a different
>>> node. The limit is 48h. This means that after 48h (assuming I use a
>>> custom submit script with this option) all of my engines will be
>>> killed and ipython will stop receiving jobs?
>>
>> Yes, that is right.
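
To expand on this: the submit script that ipcluster uses for the engines
is configurable, so the 48h request can live there. Something along these
lines in ipcluster_config.py should work with the 0.11-style launchers
(treat it as a sketch and check the parallel docs for your version, since
the exact class and option names have moved around between releases):

    # ipcluster_config.py (sketch; verify launcher/option names for
    # your IPython version)
    c = get_config()

    c.IPClusterStart.controller_launcher_class = 'SGE'
    c.IPClusterEngines.engine_launcher_class = 'SGE'

    # {n} and {profile_dir} are filled in by ipcluster at submit time;
    # -l h_rt requests the 48h wall-clock limit discussed here.
    c.SGEEngineSetLauncher.batch_template = """#$ -V -cwd
    #$ -N ipengine
    #$ -t 1-{n}
    #$ -l h_rt=48:00:00
    ipengine --profile-dir={profile_dir}
    """

Then something like `ipcluster start --profile=sge -n 48` submits the
engines using that template.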
>>
>>> In other words, I cannot use ipython to run a set of jobs that would
>>> take longer than two days?
>>
>> Yep, there is no way of getting around the limitations/constraints of
>> the queues.
>>
>>> Additionally, if I decide to specify a 48h max job time, I will most
>>> likely wait longer for the appropriate nodes to become free, which is
>>> not really necessary when my atomic jobs run much faster.
>>
>> Yep, such is life on shared clusters with batch systems :(
>>
>> What about just firing up an EC2 cluster using StarCluster?
>>
>>> 2) I need to specify how many engines I want to use. Assuming I want
>>> my set of jobs to be done as quickly as possible, I should specify a
>>> number bigger than the number of available nodes. This means that in
>>> many situations I will spawn far too many ipengines that will just
>>> sit there doing nothing. This solution seems to lack scalability.
>>>
>>> Or maybe I am using ipython/SGE in a wrong way?
>>
>> From what you have described, I think you are using ipython/SGE in the
>> right manner; you are just running into the fact that batch systems are
>> not set up for truly interactive usage.
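
One mitigating note on the engine-count concern: if you submit work
through a load-balanced view, the number of engines only affects how fast
the task list drains, so extra engines idle harmlessly and you can request
fewer without changing your code. A minimal sketch, again assuming the
0.11-style IPython.parallel API with illustrative names:

    from IPython.parallel import Client

    rc = Client()
    lview = rc.load_balanced_view()  # tasks go to whichever engine is free

    def process(item):
        # stand-in for one atomic job
        return item ** 2

    # The task list is independent of the engine count: with 4 engines or
    # 400, the same call works and the scheduler keeps them all busy.
    results = lview.map_async(process, range(1000))
    print(results.get()[:5])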
>>
>> Cheers,
>>
>> Brian
>>
>>> Best regards,
>>> Chris
>>> _______________________________________________
>>> IPython-User mailing list
>>> IPython-User@scipy.org
>>> http://mail.scipy.org/mailman/listinfo/ipython-user
>>>
>>
>>
>>
>> --
>> Brian E. Granger, Ph.D.
>> Assistant Professor of Physics
>> Cal Poly State University, San Luis Obispo
>> bgranger@calpoly.edu
>> ellisonbg@gmail.com
>



-- 
Brian E. Granger, Ph.D.
Assistant Professor of Physics
Cal Poly State University, San Luis Obispo
bgranger@calpoly.edu
ellisonbg@gmail.com

