[IPython-User] Parallel ipython over ssh+NFS
Jose Gomez-Dans
jgomezdans@gmail....
Tue Jun 12 12:10:34 CDT 2012
Hi,
I have IPython 0.11 on our system, and I would like to do some
trivial parallel processing, making use of ssh and the fact that our
home directories are on NFS (i.e. ~/.config/ipython is visible at the
same path on all hosts).
I followed the instructions here:
<http://mail.scipy.org/pipermail/ipython-user/2011-December/008833.html>
1.- Create a new test profile using
$ ipython profile create sshtest --parallel
2.- Edit ipcluster_config.py to look like this
c = get_config()
# delay 60 seconds between starting the controller, and starting the engines
c.IPClusterStart.delay = 60
# start engines with SSH
c.IPClusterEngines.engine_launcher = \
    'IPython.parallel.apps.launcher.SSHEngineSetLauncher'
# You only need to use the SSHController launcher if you are *not*
# running ipcluster from the controller machine
c.IPClusterStart.controller_launcher = \
    'IPython.parallel.apps.launcher.SSHControllerLauncher'
c.SSHControllerLauncher.hostname = 'my_hostname'
# this and above should result in
# $> ssh you@conlin.lanl.gov "ipcontroller --ip=128.165...
#
# add `--reuse` to this only *after* everything appears to be working,
# because it can prevent certain changes from having an effect
c.SSHControllerLauncher.program_args = ['--ip=xx.xx.xx.xx',
    '--log-to-file', '--log-level=10']
# if you are not using SSH to launch the controller:
c.LocalControllerLauncher.controller_args = ['--ip=xx.xx.xx.xx',
    '--log-to-file', '--log-level=10']
c.SSHEngineSetLauncher.engines = {
    'sun-node01': 2,
    'sun-node02': 2,  # etc.
}
3.- Test running the controller on my_hostname
$ ipcontroller --profile=sshtest
4.- Test connecting an engine from one node (say sun-node02 above):
$ ipengine --profile=sshtest
5.- Steps (3) and (4) are successful (there's a connection, and I can
pass things to the engine, see its IP address, etc.)
Now, I would like to just use ipcluster to launch the controller and
engines. This doesn't work as expected: if I launch
$ ipcluster --profile=sshtest
then after ~60 s, all I get is ipcluster launching 12 engines on the
controller machine (the controller has 12 cores, so this might be a
default). They are all local; none are started on the nodes. I can ssh
into the nodes and launch the engines there using
ipcluster engines --profile=sshtest. However, this launches 4 engines
per node (these are 4-core machines) and not the 2 set in the config.
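For reference, here is a minimal sketch of how one might check from a client which host each engine ended up on. It assumes the IPython.parallel Client API from the 0.11 series and the sshtest profile above; the helper names engines_per_host and report are hypothetical, and none of this has been run against the cluster described here.

```python
import socket
from collections import Counter


def engines_per_host(hostnames):
    """Count how many engines reported each hostname."""
    return dict(Counter(hostnames))


def report(profile="sshtest"):
    """Ask every connected engine for its hostname and tally the result."""
    # Assumption: IPython.parallel is importable (IPython 0.11-era API).
    from IPython.parallel import Client

    rc = Client(profile=profile)
    # rc[:] is a view on all engines; apply_sync returns one result per engine.
    hostnames = rc[:].apply_sync(socket.gethostname)
    return engines_per_host(hostnames)
```

If the SSH launcher were being picked up, report() should list sun-node01 and sun-node02 with 2 engines each rather than only the controller's hostname.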
I really don't understand what's going on here, but my problem appears
similar to the one Jeremy reported. However, I can only use two cores
on each node, so running ipcluster engines on the nodes doesn't work
for me.
Additionally, how would one go about giving the engines a particular
"nice" value? If I don't sort this stuff out, I think I might become
very unpopular among my colleagues! ;-)
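One possible approach, sketched here under assumptions rather than tested: once the engines are connected, each engine could raise its own niceness through os.nice. This assumes the engines run on POSIX hosts and that the IPython.parallel Client API from the 0.11 series is available; the helper names current_niceness and renice_engines are hypothetical.

```python
import os


def current_niceness():
    """Return this process's niceness without changing it (POSIX only)."""
    # os.nice(0) adds 0 to the niceness and returns the resulting value.
    return os.nice(0)


def renice_engines(profile="sshtest", increment=10):
    """Ask every connected engine to add `increment` to its own niceness."""
    # Assumption: IPython.parallel is importable (IPython 0.11-era API).
    from IPython.parallel import Client

    rc = Client(profile=profile)
    # os.nice runs inside each engine process and returns its new niceness.
    return rc[:].apply_sync(os.nice, increment)
```

Note that an unprivileged process can only increase its niceness, so calling renice_engines repeatedly keeps lowering the engines' priority.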
Thanks!
Jose