[IPython-User] Can't get ipcluster to run on LAN

Cavendish McKay cavendish.mckay@gmail....
Wed Aug 8 08:46:46 CDT 2012


I have been experiencing a similar problem with the SSH launcher on
0.14-dev.   For me, the problem seems to be related to the
--profile-dir command line argument.
As you describe, the first error messages I got seemed to indicate
that there was an addressing problem, but using --log-level=DEBUG
showed that the controller was actually dying before the engines could
try to connect.  This was only the case when using the ipcluster
command, not when using ipcontroller and ipengine separately.

Here was the relevant chunk of output, for reference:

2012-08-07 12:15:47.815 [IPClusterStart] Starting Controller with
SSHControllerLauncher
2012-08-07 12:15:47.816 [IPClusterStart] Starting
SSHControllerLauncher: ['ssh', '-tt', u'atlacamani-eth3',
'ipcontroller', '--profile-dir', u'', '--log-to-file',
'--log-level=20', '--ip=192.168.0.100']
2012-08-07 12:15:47.818 [IPClusterStart] Process 'ssh' started: 24977
2012-08-07 12:15:48.166 [IPClusterStart] usage: ipcontroller [-h]
[--ip HUBFACTORY.IP]
2012-08-07 12:15:48.166 [IPClusterStart]
[--enginessh IPCONTROLLERAPP.ENGINE_SSH_SERVER]
2012-08-07 12:15:48.166 [IPClusterStart]
[--profile-dir PROFILEDIR.LOCATION]
2012-08-07 12:15:48.166 [IPClusterStart]
[--work-dir IPCONTROLLERAPP.WORK_DIR]
2012-08-07 12:15:48.166 [IPClusterStart]                     [--port
HUBFACTORY.REGPORT]
2012-08-07 12:15:48.166 [IPClusterStart]
[--transport HUBFACTORY.TRANSPORT]
2012-08-07 12:15:48.166 [IPClusterStart]
[--log-to-file [IPCONTROLLERAPP.LOG_TO_FILE]]
2012-08-07 12:15:48.167 [IPClusterStart]                     [--ping
HEARTMONITOR.PERIOD]
2012-08-07 12:15:48.167 [IPClusterStart]
[--location IPCONTROLLERAPP.LOCATION]
2012-08-07 12:15:48.167 [IPClusterStart]
[--clean-logs IPCONTROLLERAPP.CLEAN_LOGS]
2012-08-07 12:15:48.167 [IPClusterStart]                     [--scheme
TASKSCHEDULER.SCHEME_NAME]
2012-08-07 12:15:48.167 [IPClusterStart]
[--profile IPCONTROLLERAPP.PROFILE]
2012-08-07 12:15:48.167 [IPClusterStart]
[--cluster-id IPCONTROLLERAPP.CLUSTER_ID]
2012-08-07 12:15:48.167 [IPClusterStart]                     [--hwm
TASKSCHEDULER.HWM]
2012-08-07 12:15:48.167 [IPClusterStart]                     [--ssh
IPCONTROLLERAPP.SSH_SERVER]
2012-08-07 12:15:48.167 [IPClusterStart]                     [--ident
SESSION.SESSION]
2012-08-07 12:15:48.167 [IPClusterStart]
[--ipython-dir IPCONTROLLERAPP.IPYTHON_DIR]
2012-08-07 12:15:48.167 [IPClusterStart]                     [--url
HUBFACTORY.URL]
2012-08-07 12:15:48.167 [IPClusterStart]
[--log-level IPCONTROLLERAPP.LOG_LEVEL]
2012-08-07 12:15:48.168 [IPClusterStart]
[--log-url IPCONTROLLERAPP.LOG_URL]
2012-08-07 12:15:48.168 [IPClusterStart]
[--keyfile SESSION.KEYFILE] [--user SESSION.USERNAME]
2012-08-07 12:15:48.168 [IPClusterStart]
[--no-secure] [--restore] [--usethreads] [--reuse]
2012-08-07 12:15:48.168 [IPClusterStart]                     [--nodb]
[--mongodb] [--sqlitedb] [--quiet] [--dictdb]
2012-08-07 12:15:48.168 [IPClusterStart]                     [--init]
[--debug] [--secure]
2012-08-07 12:15:48.168 [IPClusterStart] ipcontroller: error: argument
--profile-dir: expected one argument
2012-08-07 12:15:48.183 [IPClusterStart] Connection to atlacamani-eth3 closed.
2012-08-07 12:15:48.184 [IPClusterStart] Process 'ssh' stopped:
{'pid': 24977, 'exit_code': 2}
2012-08-07 12:15:48.184 [IPClusterStart] IPython cluster: stopping
2012-08-07 12:15:48.815 [IPClusterStart] Starting 80 Engines with SSH
2012-08-07 12:15:48.817 [IPClusterStart] Starting SSHEngineLauncher:
['ssh', '-tt', u'atlacamani309', '/usr/bin/python', '-c', 'from
IPython.parallel.apps.ipengineapp import launch_new_instance;
launch_new_instance()', '--profile-dir', u'', '--log-to-file',
'--log-level=20']
(etc.)
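
One way to see how an empty profile directory could produce the "expected one argument" error in the log above: ssh joins the remote command's arguments with spaces and the remote shell splits them again, so an empty string does not survive the trip. The sketch below assumes that behaviour. It uses a small stand-in argparse parser (hypothetical, not ipcontroller's real parser), and shlex.split only approximates the remote shell's word splitting. It is written for Python 3.

```python
import argparse
import shlex

# Argument list as logged by SSHControllerLauncher (ssh flags and host omitted).
remote_args = ['ipcontroller', '--profile-dir', '', '--log-to-file',
               '--log-level=20', '--ip=192.168.0.100']

# Assumption: ssh joins its trailing arguments with single spaces and the
# remote shell re-splits the resulting string.
remote_command = ' '.join(remote_args)
resplit = shlex.split(remote_command)


def parse(argv):
    """Hypothetical stand-in parser; not ipcontroller's real one."""
    parser = argparse.ArgumentParser(prog='ipcontroller')
    parser.add_argument('--profile-dir')
    parser.add_argument('--log-to-file', nargs='?', const=True)
    parser.add_argument('--log-level')
    parser.add_argument('--ip')
    return parser.parse_args(argv)


# Passed directly, the empty string is accepted as the option's value.
direct = parse(remote_args[1:])

# After the join/re-split round trip the empty string is gone, so
# --profile-dir is immediately followed by another option and parsing fails.
try:
    parse(resplit[1:])
    failed = False
    exit_code = 0
except SystemExit as exc:
    failed = True
    exit_code = exc.code

print(resplit)
print('parse failed:', failed, 'exit code:', exit_code)
```

If this reading is right, the empty value never reaches ipcontroller at all, which would be consistent with the usage message and the exit code in the log.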

So, evidently ipcontroller doesn't like having an empty string passed
in following --profile-dir.  I fixed that by specifying profile_dir in
ipcluster_config.py.  Then the controller started without problems,
but the engines still choked:

(trimmed, because it's trying to start 80 engines and they're all the same)

2012-08-07 12:38:41.578 [IPClusterStart] Starting SSHEngineLauncher:
['ssh', '-tt', u'atlacamani305', '/usr/bin/python', '-c', 'from
IPython.parallel.apps.ipengineapp import launch_new_instance;
launch_new_instance()', '--profile-dir',
u'/home/cmckay/.ipython/profile_cluster', '--log-to-file',
'--log-level=20']
2012-08-07 12:38:41.583 [IPClusterStart] Process 'ssh' started: 25587
2012-08-07 12:38:42.494 [IPClusterStart] bash: -c: line 0: syntax
error near unexpected token `--profile-dir'
2012-08-07 12:38:42.494 [IPClusterStart] Connection to atlacamani305 closed.
2012-08-07 12:38:42.510 [IPClusterStart] Process 'ssh' stopped:
{'pid': 25587, 'exit_code': 1}

(etc.)

So, summing up:

1. There is a note in ipcluster_config.py that both SSHEngineLauncher
and SSHControllerLauncher inherit config from SSHClusterLauncher and
SSHLauncher.

2. SSHClusterLauncher adds ['--profile-dir', self.remote_profile_dir]
to the arguments.

3. It seems that if profile_dir isn't specified in
ipcluster_config.py, it doesn't get generated properly (hence the
error message upon starting ipcontroller, above), and

4. The way the command is being put together, these args don't seem to
be getting to ipengine via ssh (hence the error message related to
starting ipengine, above). This error message is coming from bash,
rather than from ipengine, and it looks to me like SSHEngineLauncher
isn't actually using the ipengine script, so maybe the command line
arguments aren't appropriate here anyway.
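
A possible reading of the bash error in point 4, offered as a guess rather than a confirmed diagnosis: once ssh flattens the argument list into one string, the quoting around the -c program is lost, and the remote bash sees `launch_new_instance() --profile-dir`, which it parses as the start of a function definition. The sketch below assumes that behaviour and shows that quoting each argument before joining (shlex.quote in Python 3, pipes.quote in Python 2) lets the list survive a re-split. It is not the launcher's actual code.

```python
import shlex

# Argument list as logged by SSHEngineLauncher (ssh flags and host omitted).
engine_args = ['/usr/bin/python', '-c',
               'from IPython.parallel.apps.ipengineapp import launch_new_instance; '
               'launch_new_instance()',
               '--profile-dir', '/home/cmckay/.ipython/profile_cluster',
               '--log-to-file', '--log-level=20']

# Naive join: the -c program is no longer a single shell word, so the remote
# shell would see "launch_new_instance() --profile-dir ..." as shell syntax.
naive = ' '.join(engine_args)

# Quoting each argument first keeps every list element a single shell word.
quoted = ' '.join(shlex.quote(arg) for arg in engine_args)

# shlex.split approximates how a POSIX shell would split the command line.
naive_roundtrip = shlex.split(naive)
quoted_roundtrip = shlex.split(quoted)

print(naive)
print(quoted)
print('quoted form round-trips:', quoted_roundtrip == engine_args)
```

The same quoting would also preserve an empty profile directory, since shlex.quote turns an empty string into a pair of single quotes.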

I'm not (yet) well enough versed in the internals to figure out the
right way to fix this, or even whether it's just a configuration error
on my part.  Also, this may not be the problem you're having, but it
seemed similar enough that I thought I would chime in.

Cavendish





On Wed, Aug 8, 2012 at 3:42 AM, Johann Rohwer <jr@sun.ac.za> wrote:
> On Tuesday 07 August 2012 16:03:52 MinRK wrote:
>
>> Try removing the old connection files (in `ipython locate profile
>> home_ssh`/security).
>>
>> There might be an issue when reuse_files=True, that sometimes the
>> old
>> connection files can set the config, so if you ran once with
>> ip=127.0.0.1, then set reuse_files=True, it's possible that
>> 127.0.0.1 is still being used.
>>
>> Can you verify the contents of the JSON files?
>
> Initially I did not use reuse_files, and I did verify that the
> JSON files were indeed deleted and re-created every time when starting
> ipcluster.  So this was not the issue.
>
> How does ipcluster obtain the hostname/IP address of the controller if
> not specified? Because this is a local LAN, the assigned IPs from the
> DHCP server are in the 192.168.0.* subnet and there is of course no
> DNS/FQDN. My problem was that the specifications in the config files
> seemed to be "ignored"/overridden by the internal ipcluster methods to
> assign the hostname/IP. Does it look at /etc/hosts? This has of course
> the specification of 127.0.0.1 for localhost.
>
> --Johann

