[IPython-dev] client connection in shared filesystem for a farm of machines

MinRK benjaminrk@gmail....
Wed Jun 22 13:41:42 CDT 2011


On Wed, Jun 22, 2011 at 08:39, Johann Cohen-Tanugi
<johann.cohentanugi@gmail.com> wrote:
> Hello,
> I am using a farm of several machines, and I log in to it using ssh and
> a generic farm name, ending up on one specific machine depending on
> load etc. All the machines of the farm see the AFS filesystem, on
> which a json file was created when I fired ipcluster after a first login
> to the farm.
> Then I start another terminal and log in again to the farm, ending up on
> a different machine from the one ipcluster is running on.
> If I then do:
> from IPython.parallel import Client
> c = Client()
>
> it hangs. If I do:
> c = Client('/u/ec/cohen/.config/ipython/profile_default/security/ipcontroller-client.json')
> it returns:
> ---------------------------------------------------------------------------
> TypeError                                 Traceback (most recent call last)
> /a/wain006/g.glast.u54/cohen/IPYDEV/test_directory/<ipython-input-3-9325a79c5ee0> in <module>()
> ----> 1 c = Client('/u/ec/cohen/.config/ipython/profile_default/security/ipcontroller-client.json')
>
> TypeError: __new__() takes exactly 1 argument (2 given)
>
> so I guess the doc in
> http://ipython.org/ipython-doc/dev/parallel/parallel_intro.html#getting-started
> needs a patch.

The docs are right - I just introduced a bug when I made the Client
inherit from HasTraits.  I pushed a simple fix to master, so it
accepts positional arguments again.
I also included a fix for the 'hang', which is actually a units
problem in pyzmq's select - trying to connect to a nonexistent
controller will now time out after 10 seconds (timeout is an arg in the
Client constructor, so you can make it shorter if you like).
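
A minimal sketch of using that timeout, assuming a build from master that
includes the fix. The helper names connection_file and connect are
hypothetical and not part of IPython; the path layout is copied from the
paths quoted in this thread, and timeout is given in seconds.

```python
import os


def connection_file(ipython_dir, profile="default"):
    """Build the path to a profile's client connection file.

    Hypothetical helper: the layout mirrors the paths quoted in this
    thread (<ipython_dir>/profile_<name>/security/...).
    """
    return os.path.join(ipython_dir, "profile_" + profile,
                        "security", "ipcontroller-client.json")


def connect(ipython_dir, timeout=3):
    """Connect to the controller, giving up after `timeout` seconds."""
    # Imported lazily so the path helper works without IPython installed.
    from IPython.parallel import Client
    # The connection file is passed positionally, which works again
    # once the fix described above is in place.
    return Client(connection_file(ipython_dir), timeout=timeout)
```

Calling connect('/u/ec/cohen/.config/ipython') from a machine that cannot
reach the controller should then fail after about 3 seconds instead of
hanging.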

>
> Finally if I do :
> c = Client(url_or_file='/afs/slac/u/ec/cohen/.config/ipython/profile_default/security/ipcontroller-client.json')
> it also hangs, while on the same ipython session I can immediately check:
> In [8]: ls -ltr /afs/slac/u/ec/cohen/.config/ipython/profile_default/security/ipcontroller-client.json
> -rw------- 1 cohen ec 130 Jun 22 08:06 /afs/slac/u/ec/cohen/.config/ipython/profile_default/security/ipcontroller-client.json
>
> that indeed the JSON file is accessible via the AFS file sharing.
>
>
> I checked that if I forced connecting to the same machine instead of
> using the generic farm name,
> c=Client() immediately returns with the engines attached and I can
> proceed normally.

The Controller only listens on loopback by default for security
reasons. If you want to connect to a different machine, you must
instruct the Controller to listen on a public interface (e.g.
ip=0.0.0.0), which you should only do if your cluster is safely behind
a firewall. Otherwise, you must use ssh tunnels to connect to the
Controller, via the Client's 'ssh' arg.
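
A rough sketch of the client side of the ssh-tunnel option. It assumes the
keyword is spelled sshserver, as in the IPython 0.11-era Client signature
(what I called the 'ssh' arg above); 'user@controller-host' is a
placeholder, and client_kwargs and connect are hypothetical helpers, not
IPython functions.

```python
def client_kwargs(sshserver=None, timeout=10):
    """Collect keyword arguments for Client (hypothetical helper).

    sshserver is only included when a tunnel is wanted, i.e. when the
    Controller is still listening on loopback on another machine.
    """
    kwargs = {"timeout": timeout}
    if sshserver is not None:
        kwargs["sshserver"] = sshserver
    return kwargs


def connect(url_file, sshserver=None, timeout=10):
    """Connect directly, or through an ssh tunnel if sshserver is given."""
    # Imported lazily so client_kwargs can be used without IPython installed.
    from IPython.parallel import Client
    return Client(url_file, **client_kwargs(sshserver, timeout))
```

For example, connect(path_to_json, sshserver='user@controller-host') would
tunnel to the machine running ipcluster, while connect(path_to_json) is
enough once the Controller listens on a reachable interface.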

-MinRK

>
> Thanks in advance for the help,
> johann

