[IPython-User] IPython LSF support
eklavyaa
eklavyaa@gmail....
Wed Sep 7 21:12:10 CDT 2011
It seems to work with the following edits to the configuration.
The config files were created using:
ipython profile create --parallel --profile=lsf
ipcluster_config.py edits:
c.IPClusterStart.controller_launcher_class = 'LSFControllerLauncher'
c.IPClusterStart.engine_launcher_class = 'LSFEngineSetLauncher'
ipcontroller_config.py edits:
c.HubFactory.ip = '*'
I was able to start the controller and engines using:
ipcluster start --n=2 --profile=lsf
However, some of the engines would time out. Here are the logs of two
engines from the same session:
$ cat ipengine.e.8694278
[IPEngineApp] Using existing profile dir:
u'/home/unix/username/.ipython/profile_lsf'
[IPEngineApp] Loading url_file
u'/home/unix/shsingh/.ipython/profile_lsf/security/ipcontroller-engine.json'
[IPEngineApp] Registering with controller at tcp://X.X.X.X:36875
[IPEngineApp] Completed registration with id 0
[IPEngineApp] Engine Interrupted, shutting down...
$ cat ipengine.e.8694300
[IPEngineApp] Using existing profile dir:
u'/home/unix/username/.ipython/profile_lsf'
[IPEngineApp] Loading url_file
u'/home/unix/shsingh/.ipython/profile_lsf/security/ipcontroller-engine.json'
[IPEngineApp] Registering with controller at tcp://X.X.X.X:36875
[IPEngineApp] Registration timed out after 2.0 seconds
The first one was able to register, while the second one was not and
timed out.
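
To see how often this happens across a session, the two outcomes can be told
apart from the log text alone. A minimal sketch, assuming the two message
formats in the logs above are the only ones of interest; classify_engine_log
and tally are hypothetical helper names, not part of IPython:

```python
import re
from collections import Counter

# Patterns copied from the engine log lines quoted above.
REGISTERED = re.compile(r"Completed registration with id (\d+)")
TIMED_OUT = re.compile(r"Registration timed out after ([\d.]+) seconds")


def classify_engine_log(text):
    """Classify one engine log as registered, timed out, or unknown.

    Returns ("registered", engine_id), ("timed_out", seconds),
    or ("unknown", None) if neither message appears.
    """
    match = REGISTERED.search(text)
    if match:
        return ("registered", int(match.group(1)))
    match = TIMED_OUT.search(text)
    if match:
        return ("timed_out", float(match.group(1)))
    return ("unknown", None)


def tally(logs):
    """Count outcomes over a mapping of {log name: log text}."""
    return Counter(classify_engine_log(text)[0] for text in logs.values())
```

Feeding it the contents of the ipengine.e.* files from one session gives a
quick count of registered versus timed-out engines.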
My guess is that the engines might be getting allocated to a node before
the controller is up, but I can't verify this.
Note that this time-out issue happens most of the time, but not always.
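
One thing that may be worth trying is giving the engines longer to register.
A sketch, assuming the 2.0 seconds in the log is the default of the
EngineFactory.timeout setting and that it can be raised in the profile's
ipengine_config.py. This is untested on LSF, and the value 30 is an
arbitrary choice:

```python
# ipengine_config.py in profile_lsf (config fragment, loaded by IPython)
c = get_config()

# Assumption: EngineFactory.timeout is the number of seconds an engine waits
# for the controller to answer its registration request (2.0 in the log above).
c.EngineFactory.timeout = 30.0
```

If the engines really are scheduled before the controller is reachable, a
longer wait should let more of them register; if they still time out, the
cause is probably elsewhere.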
Any ideas on what might be going wrong?
-E
MinRK wrote:
>
> Yes, 0.11 has LSF support. Just use the
> LSFEngineSetLauncher/LSFControllerLauncher,
> the same as you would with PBS or SGE.
>
> It is basically untested, as the author of the LSF launchers is the only
> one to have tested it, to our knowledge, so please let us know about
> shortcomings, etc. I should scan through the docs to make sure they
> aren't out of sync.
>
> -MinRK
>
> On Wed, Aug 31, 2011 at 14:45, eklavyaa <eklavyaa@gmail.com> wrote:
>
>>
>> Hi all,
>>
>> I am interested in using IPython's parallel computing modules with an LSF
>> scheduler.
>>
>> I looked through the documentation and the forums and was unable to
>> figure out whether LSF support exists in the current version.
>>
>> Here are two somewhat conflicting pieces of information in this context :
>>
>>
>> http://ipython.org/ipython-doc/dev/whatsnew/version0.10.html says
>>
>> "...The only significant new feature is that IPython’s parallel computing
>> machinery now supports natively the Sun Grid Engine and LSF schedulers."
>>
>>
>> http://ipython.org/ipython-doc/dev/development/roadmap.html says
>>
>> "...We need to add support for other batch systems (LSF, Condor, etc.)."
>>
>>
>> Would appreciate your inputs on this.
>>
>> Thanks
>> E
>>
>>
>>
>> --
>> View this message in context:
>> http://old.nabble.com/IPython-LSF-support-tp32375842p32375842.html
>> Sent from the IPython - User mailing list archive at Nabble.com.
>>
>> _______________________________________________
>> IPython-User mailing list
>> IPython-User@scipy.org
>> http://mail.scipy.org/mailman/listinfo/ipython-user
>>
>
--
View this message in context: http://old.nabble.com/IPython-LSF-support-tp32375842p32420689.html
Sent from the IPython - User mailing list archive at Nabble.com.