[IPython-User] Can't get ipcluster to run on LAN
Johann Rohwer
jr@sun.ac...
Mon Aug 6 03:43:59 CDT 2012
I'm running a small LAN at home with an ADSL router acting as DHCP
server. My laptop is on 192.168.0.4 and a server (host0) is on
192.168.0.2, both running Ubuntu 12.04 (the home directory is *NOT*
shared). I've been struggling to get ipcluster going over SSH: the
internal ipcluster magic always returns 127.0.0.1 (i.e. the loopback
interface) for the laptop's IP instead of 192.168.0.4, so with the
default configuration the engines on host0 can't find the controller
on the laptop. In the end I specified the IP addresses in the
configuration files and opted for reusable JSON connection files, but
ipcluster still won't start. Strangely enough, when I start
ipcontroller on the laptop and the engines on the laptop and host0
separately, the cluster starts fine with the same configuration files.
The logs are not informative. Why would ipcluster not work here?
IPython version: 0.13, installed with pip on both laptop and host0.
profile name: ssh_home (on both laptop and host0)
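
For reference, the loopback symptom described above can be checked outside of ipcluster with a few lines of standard-library Python. This is only a diagnostic sketch: it assumes the 127.0.0.1 result comes from how the laptop's own hostname resolves, which is not confirmed here. The function names are made up for illustration, and the peer address is host0's address from this setup.

```python
import socket


def hostname_ip():
    """Address that this machine's own hostname resolves to, or None on failure."""
    try:
        return socket.gethostbyname(socket.gethostname())
    except socket.error:
        return None


def outbound_ip(peer="192.168.0.2", port=9):
    """Local address the OS would pick to reach `peer`, or None if unroutable.

    Connecting a UDP socket sends no packets; it only selects a route.
    """
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        s.connect((peer, port))
        return s.getsockname()[0]
    except socket.error:
        return None
    finally:
        s.close()


if __name__ == "__main__":
    print("hostname resolves to: %s" % hostname_ip())
    print("address used to reach host0: %s" % outbound_ip())
```

If the first value is a loopback address while the second is 192.168.0.4, the hostname lookup rather than the network setup would be the likely source of the wrong address.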
--------------------------------------------------
Manually edited configuration file options:
--------------------------------------------------
ipcluster_config.py:c.IPClusterStart.engine_launcher_class = 'SSHEngineSetLauncher'
ipcluster_config.py:c.SSHLauncher.location = '192.168.0.4'
ipcluster_config.py:c.SSHLauncher.to_send = []
ipcluster_config.py:c.SSHEngineSetLauncher.engine_args = ['--profile-dir=/home/jr/.config/ipython/profile_ssh_home']
ipcluster_config.py:c.SSHEngineSetLauncher.engines = { '192.168.0.4': 2, '192.168.0.2': 2 }
ipcontroller_config.py:c.IPControllerApp.reuse_files = True
ipcontroller_config.py:c.HubFactory.client_ip = '*'
ipcontroller_config.py:c.HubFactory.ip = '*'
ipcontroller_config.py:c.HubFactory.location = '192.168.0.4'
ipcontroller_config.py:c.HubFactory.engine_ip = '*'
(none in ipengine_config.py)
--------------------------------------------------
Console output on laptop when starting ipcluster:
--------------------------------------------------
$ ipcluster start --profile=ssh_home
2012-08-06 10:02:46,132.132 [IPClusterStart] Using existing profile
dir: u'/home/jr/.config/ipython/profile_ssh_home'
2012-08-06 10:02:46.203 [IPClusterStart] Starting ipcluster with
[daemon=False]
2012-08-06 10:02:46.204 [IPClusterStart] Creating pid file:
/home/jr/.config/ipython/profile_ssh_home/pid/ipcluster.pid
2012-08-06 10:02:46.204 [IPClusterStart] Starting Controller with
LocalControllerLauncher
2012-08-06 10:02:47.204 [IPClusterStart] Starting 4 Engines with
SSHEngineSetLauncher
2012-08-06 10:02:53.278 [IPClusterStart]
Engines shutdown early, they probably failed to connect.
Check the engine log files for output.
If your controller and engines are not on the same machine, you probably
have to instruct the controller to listen on an interface other than
localhost.
You can set this by adding "--ip='*'" to your
ControllerLauncher.controller_args.
Be sure to read our security docs before instructing your controller to
listen on a public interface.
2012-08-06 10:02:53.286 [IPClusterStart] IPython cluster: stopping
2012-08-06 10:02:56.287 [IPClusterStart] Removing pid file:
/home/jr/.config/ipython/profile_ssh_home/pid/ipcluster.pid
--------------------------------------------------
ipcontroller log on laptop:
--------------------------------------------------
2012-08-06 10:02:46.655 [IPControllerApp] loading connection info from
/home/jr/.config/ipython/profile_ssh_home/security/ipcontroller-engine.json
2012-08-06 10:02:46.694 [IPControllerApp] loading connection info from
/home/jr/.config/ipython/profile_ssh_home/security/ipcontroller-client.json
2012-08-06 10:02:46.734 [IPControllerApp] Hub listening on
tcp://*:47818 for registration.
2012-08-06 10:02:46.736 [IPControllerApp] Hub using DB backend: 'NoDB'
2012-08-06 10:02:46.994 [IPControllerApp] hub::created hub
2012-08-06 10:02:46.995 [IPControllerApp] task::using Python leastload
Task scheduler
2012-08-06 10:02:46.996 [IPControllerApp] Heartmonitor started
2012-08-06 10:02:47.035 [IPControllerApp] Creating pid file:
/home/jr/.config/ipython/profile_ssh_home/pid/ipcontroller.pid
2012-08-06 10:02:53.279 [IPControllerApp] Received signal 2, shutting
down
2012-08-06 10:02:53.280 [IPControllerApp] terminating children...
--------------------------------------------------
Console output when starting ipcontroller on the laptop and ipengines
on the laptop and host0 separately:
--------------------------------------------------
$ ipcontroller --profile=ssh_home
2012-08-06 10:11:27,128.128 [IPControllerApp] Using existing profile
dir: u'/home/jr/.config/ipython/profile_ssh_home'
2012-08-06 10:11:27.132 [IPControllerApp] loading connection info from
/home/jr/.config/ipython/profile_ssh_home/security/ipcontroller-engine.json
2012-08-06 10:11:27.132 [IPControllerApp] loading connection info from
/home/jr/.config/ipython/profile_ssh_home/security/ipcontroller-client.json
2012-08-06 10:11:27.140 [IPControllerApp] Hub listening on
tcp://*:47818 for registration.
2012-08-06 10:11:27.142 [IPControllerApp] Hub using DB backend: 'NoDB'
2012-08-06 10:11:27.399 [IPControllerApp] hub::created hub
2012-08-06 10:11:27.400 [IPControllerApp] task::using Python leastload
Task scheduler
2012-08-06 10:11:27.401 [IPControllerApp] Heartmonitor started
2012-08-06 10:11:27.424 [IPControllerApp] Creating pid file:
/home/jr/.config/ipython/profile_ssh_home/pid/ipcontroller.pid
2012-08-06 10:11:27.434 [scheduler] Scheduler started [leastload]
on laptop:
--------------------------------------------------
$ ipengine --profile=ssh_home
2012-08-06 10:12:11,390.390 [IPEngineApp] Using existing profile dir:
u'/home/jr/.config/ipython/profile_ssh_home'
2012-08-06 10:12:11.392 [IPEngineApp] Loading url_file
u'/home/jr/.config/ipython/profile_ssh_home/security/ipcontroller-engine.json'
2012-08-06 10:12:11.420 [IPEngineApp] Registering with controller at
tcp://192.168.0.4:47818
2012-08-06 10:12:11.883 [IPEngineApp] Completed registration with id 0
on host0:
--------------------------------------------------
$ ipengine --profile=ssh_home
2012-08-06 10:13:44,950.950 [IPEngineApp] Using existing profile dir:
u'/home/jr/.config/ipython/profile_ssh_home'
2012-08-06 10:13:44.953 [IPEngineApp] Loading url_file
u'/home/jr/.config/ipython/profile_ssh_home/security/ipcontroller-engine.json'
2012-08-06 10:13:44.967 [IPEngineApp] Registering with controller at
tcp://192.168.0.4:47818
2012-08-06 10:13:45.203 [IPEngineApp] Completed registration with id 1
--------------------------------------------------
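
Since the home directory is not shared and the connection files are reused, it may help to compare the copy of ipcontroller-engine.json on the laptop with the one on host0. The sketch below loads a connection file and masks anything that looks like a secret before printing. The helper name and the key names listed in `secret_keys` are assumptions for illustration; no particular file layout is assumed beyond it being a JSON object.

```python
import json


def summarize_connection_file(path, secret_keys=("exec_key", "key")):
    """Load a JSON connection file and mask the values of secret-looking keys."""
    with open(path) as f:
        info = json.load(f)
    # Keep every field, but replace secret values so the output is safe to post.
    return dict((k, "<hidden>" if k in secret_keys else v)
                for k, v in info.items())


if __name__ == "__main__":
    import sys
    # Pass one or more connection file paths on the command line.
    for path in sys.argv[1:]:
        print(path)
        for key, value in sorted(summarize_connection_file(path).items()):
            print("  %s = %r" % (key, value))
```

Running this on both machines against the profile's security/ipcontroller-engine.json and comparing the output should show whether the engines on each host are being pointed at the same address and port.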
Sorry for the long email, but I have no clue why ipcluster won't work.
Any ideas?
--Johann
P.S. When the controller and engines are started separately like this,
do the ipengines have to be killed manually? When I terminate
ipcontroller with Ctrl-C, I get the notification
[IPControllerApp] terminating children...
but the ipengine processes remain running.