[IPython-User] ipcluster in ssh mode -

Manuel Jung mjung@astrophysik.uni-kiel...
Fri Aug 12 05:44:29 CDT 2011


Hi,

I have been browsing the sources, looking for a way to support my use case
directly in ipcluster, because I felt silly writing a script to set up the
cluster when that is exactly what ipcluster should do for me.
This is my solution:

Since we are on a totally restricted network, where no port except ssh (22)
can be reached from outside localhost, it is perfectly safe to choose static
ports, as you suggested in your first post in this thread.

c.LocalControllerLauncher.controller_args = ['--log-level=20',
'--ip=0.0.0.0', '--location=127.0.0.1', '--port=10101',
'--HubFactory.hb=10102,10112', '--HubFactory.control=10203,10103',
'--HubFactory.mux=10204,10104', '--HubFactory.task=10205,10105']

For tunneling from the engines' hosts, I have implemented an additional
parameter for the SSHEngineSetLauncher. It allows running a shell command on
each engine host. In this case it is used to establish all the tunnels.

tunnel = [('ssh dwarf20 -N -L10101:127.0.0.1:10101 '
           '-L10102:127.0.0.1:10102 -L10112:127.0.0.1:10112 '
           '-L10103:127.0.0.1:10103 -L10104:127.0.0.1:10104 '
           '-L10105:127.0.0.1:10105').split()]
c.SSHEngineSetLauncher.engines = {'pluto' : (16, None, tunnel),
                                  'merkur' : (4, None, tunnel)}

(dwarf20 is the PC that starts the cluster as a client and hosts the
controller; pluto and merkur are the number-crunching servers, i.e. the
engines' hosts.)
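
As a rough sketch, the tunnel command above could also be generated from a
list of ports instead of being typed out by hand. The helper name
build_tunnel_command is hypothetical and not part of IPython or of the branch
mentioned below; only the ssh options -N and -L are taken from the command in
this post.

```python
def build_tunnel_command(controller_host, ports, bind_ip="127.0.0.1"):
    """Return the argv list for ONE ssh process that forwards every port.

    Hypothetical helper: each port is forwarded to the same port number
    on the controller host, as in the hand-written command.
    """
    cmd = ["ssh", controller_host, "-N"]  # -N: no remote command, tunnels only
    for port in ports:
        # -L<local_port>:<host>:<remote_port>
        cmd.append("-L%d:%s:%d" % (port, bind_ip, port))
    return cmd


# Ports taken from the tunnel command in this post.
engine_ports = [10101, 10102, 10112, 10103, 10104, 10105]

# Same shape as the hand-written value: a list holding one argv list.
tunnel = [build_tunnel_command("dwarf20", engine_ports)]
print(" ".join(tunnel[0]))
```

The result has the same list-of-argument-lists shape as the hand-written
tunnel value, so it should be usable in the engines dictionary in the same
way, assuming the launcher treats it as shown in this post.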

Let me say at this point that establishing the tunnels for all ports in one
command isn't always a good idea: they all share a single TCP connection,
and bandwidth is limited on a per-connection basis. So this may become a
bottleneck under high load.
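
If the shared connection does turn out to be a bottleneck, one alternative is
to start one ssh process per forwarded port. The sketch below is only an
illustration: build_tunnel_commands is a hypothetical helper, and it assumes
that the tunnel parameter accepts a list of several commands, which this post
does not confirm.

```python
def build_tunnel_commands(controller_host, ports, bind_ip="127.0.0.1"):
    """Return one ssh argv list per port, so each tunnel has its own
    TCP connection (hypothetical helper, not part of IPython)."""
    return [
        ["ssh", controller_host, "-N", "-L%d:%s:%d" % (port, bind_ip, port)]
        for port in ports
    ]


# Ports taken from the tunnel command in this post.
tunnels = build_tunnel_commands(
    "dwarf20", [10101, 10102, 10112, 10103, 10104, 10105]
)
for cmd in tunnels:
    print(" ".join(cmd))
```

The trade-off is that every tunnel is then a separate ssh login on the
controller host, so six logins per engine host instead of one.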

Still, this is not enough to get all connections working. On pluto, with
16 cores, I often ended up with fewer than 16 successfully connected engines.
I found that the number of simultaneous, not yet authenticated connections to
an sshd is limited to 10 by the MaxStartups parameter (see sshd_config(5)).
So I introduced a new parameter for delaying consecutive ssh connections.

c.SSHEngineSetLauncher.delay = 0.2
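
The idea behind the delay can be sketched in a few lines. This is not the
code from the branch, only an illustration of staggering the launches;
launch_staggered and its start/sleep parameters are hypothetical names.

```python
import time


def launch_staggered(commands, start, delay=0.2, sleep=time.sleep):
    """Call start(cmd) for each command, pausing `delay` seconds between
    consecutive launches so that not all ssh logins hit sshd at once."""
    handles = []
    for index, cmd in enumerate(commands):
        if index > 0:
            sleep(delay)  # no pause before the very first launch
        handles.append(start(cmd))
    return handles


# Demonstration with a stand-in for the real launcher: it only joins the
# argument list into a string instead of spawning ssh.
demo_commands = [["ssh", "pluto", "ipengine"]] * 3
started = launch_staggered(demo_commands, start=" ".join, delay=0.0)
print(started)
```

In a real launcher, start would spawn the ssh process (for example with
subprocess.Popen) and the delay would be the configured value.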

The complete setup from ipcluster_config.py can be found in the post
scriptum. I have created a branch on github for this, see
https://github.com/gzahl/ipython/tree/sshenvironment

This works for me at the moment. What do you think about this solution?

A few last thoughts:
- It would be nice if one didn't have to specify both the port configuration
and the tunnel command explicitly. Ideally you would only define the ports
and set tunneling=yes. But I'm not sure yet how this could best be done.
- I have to define '--profile=ssh' in the program_args for the
SSHEngineLauncher. Shouldn't this be chosen automatically if I'm starting
with "ipcluster start --profile=ssh"? It seems like a bug to me.
- I was testing with the ControlMaster feature of SSH (version 4 or
greater). It reuses an existing TCP connection and can speed up new ssh
connections. But one would run into the only-one-TCP-connection issue
again. Do you know this feature? I'm not sure if it is of use in this case,
but it could help to lower the SSHEngineSetLauncher.delay parameter.
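
The first thought could look roughly like the sketch below: the ports are
defined once, and both the controller arguments and the tunnel command are
derived from them. All function names are hypothetical. Which ports get
tunnelled (both hb ports, and the second port of control, mux and task) is
copied from the tunnel command in this post and is an assumption about which
side is engine-facing.

```python
REGISTRATION_PORT = 10101
# (channel name, (first port, second port)), values as used in this post
CHANNELS = [
    ("hb", (10102, 10112)),
    ("control", (10203, 10103)),
    ("mux", (10204, 10104)),
    ("task", (10205, 10105)),
]


def controller_args(reg_port, channels):
    """Build the controller argument list from the port table."""
    args = ["--ip=0.0.0.0", "--location=127.0.0.1", "--port=%d" % reg_port]
    for name, (first, second) in channels:
        args.append("--HubFactory.%s=%d,%d" % (name, first, second))
    return args


def tunnelled_ports(reg_port, channels):
    """Ports the engines need, mirroring the tunnel command in this post."""
    ports = [reg_port]
    for name, (first, second) in channels:
        # Assumption: both hb ports are forwarded, otherwise only the second.
        ports.extend([first, second] if name == "hb" else [second])
    return ports


def tunnel_command(controller_host, ports):
    """One ssh argv list forwarding every port to the controller host."""
    return ["ssh", controller_host, "-N"] + [
        "-L%d:127.0.0.1:%d" % (port, port) for port in ports
    ]


print(controller_args(REGISTRATION_PORT, CHANNELS))
print(tunnel_command("dwarf20", tunnelled_ports(REGISTRATION_PORT, CHANNELS)))
```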

Cheers
Manuel


ipcluster_config.py:

c = get_config()
c.IPClusterStart.engine_launcher_class = 'SSHEngineSetLauncher'
c.IPClusterStart.delay = 2.0
c.LocalControllerLauncher.controller_args = ['--log-level=20',
'--ip=0.0.0.0', '--location=127.0.0.1', '--port=10101',
'--HubFactory.hb=10102,10112', '--HubFactory.control=10203,10103',
'--HubFactory.mux=10204,10104', '--HubFactory.task=10205,10105']
# Are hard coded paths really a reasonable default? On my systems this
# doesn't make much sense.
c.LocalControllerLauncher.controller_cmd = ['ipcontroller']
c.SSHEngineLauncher.program = ['ipengine']
c.SSHEngineLauncher.program_args = ['--log_level=20', '--profile=ssh']
c.SSHEngineSetLauncher.engine_args = ['--log-level=20', '--profile=ssh']
c.SSHEngineSetLauncher.delay = 0.2
tunnel = [('ssh dwarf20 -N -L10101:127.0.0.1:10101 '
           '-L10102:127.0.0.1:10102 -L10112:127.0.0.1:10112 '
           '-L10103:127.0.0.1:10103 -L10104:127.0.0.1:10104 '
           '-L10105:127.0.0.1:10105').split()]
c.SSHEngineSetLauncher.engines = {'pluto' : (16, None, tunnel),
                                  'merkur' : (4, None, tunnel)}