[IPython-User] parallel ssh problems

Toby Burnett tburnett@uw....
Mon Oct 24 10:13:50 CDT 2011


Thanks for the clarifications, but my complaint was really that I don’t understand why it started the engines, and then shut them down. Are those tcgetattr messages significant?

--Toby

From: MinRK [mailto:benjaminrk@gmail.com]
Sent: Sunday, October 23, 2011 14:24
To: Toby Burnett
Cc: ipython-user@scipy.org
Subject: Re: [IPython-User] parallel ssh problems


On Sun, Oct 23, 2011 at 11:38, Toby Burnett <tburnett@uw.edu> wrote:
Sorry, after reading the instructions, I realized that I set the wrong value, but there is some confusion between the online help and instructions in the generated config-ssh/ipcluster_config.py, so I put in both lines.

Argh. When the two conflict, it is the online docs that are out of date; I'll update them now.  The default config files are generated automatically from the configurable objects, so a fresh `ipython profile create <name> --parallel` can't be out of date.


c.IPClusterEngines.engine_launcher_class = 'SSHEngineSetLauncher'
c.IPClusterEngines.engine_launcher = 'IPython.parallel.apps.launcher.SSHEngineSetLauncher'

Changes from 0.11 to 0.12:
  * added the _class suffix to the option name, to make it clearer
  * allowed launchers from IPython.parallel.apps.launcher to be specified by classname only, for convenience.  In fact, you can now just specify 'SSH' or 'MPIExec', and it will resolve to 'IPython.parallel.apps.launcher.SSHFooLauncher'.
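
As a rough illustration of the name resolution described in the second item, here is a minimal sketch. It is not the actual IPython code: the helper name `resolve_launcher` and its `kind` argument are hypothetical, and only the module path and the 'SSH' / 'MPIExec' shorthands come from the message above.

```python
# Module that holds the built-in launchers (path as quoted in this thread).
LAUNCHER_MODULE = 'IPython.parallel.apps.launcher'


def resolve_launcher(name, kind='EngineSet'):
    """Expand a short launcher name to a full import string.

    Hypothetical helper for illustration only; `kind` stands in for the
    'Foo' part of 'SSHFooLauncher' (e.g. 'EngineSet').
    """
    if '.' in name:
        # Already a dotted import path: use it as given.
        return name
    if not name.endswith('Launcher'):
        # Bare prefix such as 'SSH' or 'MPIExec': build the class name.
        name = name + kind + 'Launcher'
    # Bare class name: assume it lives in the built-in launcher module.
    return LAUNCHER_MODULE + '.' + name
```

With this sketch, 'SSH', 'SSHEngineSetLauncher', and the full dotted path all resolve to the same import string, which is the convenience the 0.12 change is after.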

Obviously, trying to clarify things without updating the docs is not a great success.  What you have is exactly right for a config file to work on both 0.11 and 0.12.  I will add a deprecation warning on the old name, so that users moving from 0.11 to 0.12 get some help, and some more detail to docs and helpstrings, to hopefully avoid future confusion.


and I set c.SSHEngineSetLauncher.engines to {'tev01':4}, a different machine from the one on which I ran ipcluster.
The results follow. The last line is very confusing; I have no idea where it got the nonexistent machine names.

Ha, that's just a poor choice on my part.  When you start multiple engines on a single host, their keys in the dict that tracks them (which you are seeing in the log message) will be 'host0', 'host1', 'host2', etc.  Obviously, that doesn't sit well with nodeNN machine naming, because the keys still look like machine names.  I'll add a '/' separator, so it's clearer that these are four engines on 'tev01', not one engine each on 'tev011' etc.
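
To make the naming collision concrete, here is a small sketch of how such keys could be built from the `engines` dict. The function `engine_keys` and its `sep` parameter are made up for illustration and are not the launcher's real code; only the 'host0'-style keys and the proposed '/' separator come from the explanation above.

```python
def engine_keys(engines, sep=''):
    """Return one tracking key per engine for a {host: count} dict.

    Hypothetical helper: with sep='' the keys mimic the current
    'host0', 'host1', ... scheme; with sep='/' they mimic the proposed one.
    """
    keys = []
    for host, n in sorted(engines.items()):
        for i in range(n):
            # Host name, optional separator, then the engine index.
            keys.append('%s%s%i' % (host, sep, i))
    return keys


print(engine_keys({'tev01': 4}))           # ['tev010', 'tev011', 'tev012', 'tev013']
print(engine_keys({'tev01': 4}, sep='/'))  # ['tev01/0', 'tev01/1', 'tev01/2', 'tev01/3']
```

The first form is what appears in the final 'engine set' log line, which is why the keys read like hosts that do not exist.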


tev11:~/analysis[878]$ipcluster start --profile=ssh &
[IPClusterStart] Using existing profile dir: u'/phys/users/tburnett/.ipython/profile_ssh'
will start the following engines: {'tev01': 4}
[IPClusterStart] Starting ipcluster with [daemon=False]
[IPClusterStart] Creating pid file: /phys/users/tburnett/.ipython/profile_ssh/pid/ipcluster.pid
[IPClusterStart] Starting LocalControllerLauncher: ['/phys/users/olsont/TEV/Glast/python27/bin/python2.7', u'/phys/users/olsont/TEV/Glast/python27/lib/python2.7/site-packages/ipython-0.11-py2.7.egg/IPython/parallel/apps/ipcontrollerapp.py', '--log-to-file', '--log-level=20', u'--profile-dir=/phys/users/tburnett/.ipython/profile_ssh']
[IPClusterStart] Process '/phys/users/olsont/TEV/Glast/python27/bin/python2.7' started: 20849
[IPClusterStart] [IPControllerApp] Using existing profile dir: u'/phys/users/tburnett/.ipython/profile_ssh'
[IPClusterStart] Scheduler started [leastload]
[IPClusterStart] Starting 24 engines
[IPClusterStart] Process 'ssh' started: 20868
[IPClusterStart] Starting SSHEngineSetLauncher: ['ssh', '-tt', u'tburnett@tev01', '/phys/users/olsont/TEV/Glast/python27/bin/python2.7', u'/phys/users/olsont/TEV/Glast/python27/lib/python2.7/site-packages/ipython-0.11-py2.7.egg/IPython/parallel/apps/ipengineapp.py', '--log-to-file', '--log-level=20']
[IPClusterStart] Process 'ssh' started: 20869
[IPClusterStart] Process 'ssh' started: 20870
[IPClusterStart] Process 'ssh' started: 20871
[IPClusterStart] Process 'engine set' started: [None, None, None, None]
[IPClusterStart] tcgetattr: Invalid argument
[IPClusterStart] tcgetattr: Invalid argument
[IPClusterStart] tcgetattr: Invalid argument
[IPClusterStart] tcgetattr: Invalid argument
[IPClusterStart] [IPEngineApp] Using existing profile dir: u'/phys/users/tburnett/.ipython/profile_default'
[IPClusterStart] [IPEngineApp] Using existing profile dir: u'/phys/users/tburnett/.ipython/profile_default'
[IPClusterStart] [IPEngineApp] Using existing profile dir: u'/phys/users/tburnett/.ipython/profile_default'
[IPClusterStart] [IPEngineApp] Using existing profile dir: u'/phys/users/tburnett/.ipython/profile_default'
[IPClusterStart] Connection to tev01 closed.
[IPClusterStart] Process 'ssh' stopped: {'pid': 20870, 'exit_code': 255}
[IPClusterStart] Connection to tev01 closed.
[IPClusterStart] Process 'ssh' stopped: {'pid': 20869, 'exit_code': 255}
[IPClusterStart] Connection to tev01 closed.
[IPClusterStart] Process 'ssh' stopped: {'pid': 20868, 'exit_code': 255}
[IPClusterStart] Connection to tev01 closed.
[IPClusterStart] Process 'ssh' stopped: {'pid': 20871, 'exit_code': 255}
[IPClusterStart] Process 'engine set' stopped: {'tev012': {'pid': 20870, 'exit_code': 255}, 'tev013': {'pid': 20871, 'exit_code': 255}, 'tev010': {'pid': 20868, 'exit_code': 255}, 'tev011': {'pid': 20869, 'exit_code': 255}}

_______________________________________________
IPython-User mailing list
IPython-User@scipy.org
http://mail.scipy.org/mailman/listinfo/ipython-user
