[IPython-User] furls only have localhost as ipcontroller location

Justin Riley justin.t.riley@gmail....
Thu Aug 12 09:30:21 CDT 2010


Keith,

> The first problem I ran into was that the ipython, ipengine,
> ipcontroller and ipcluster scripts had the wrong shell command.
> the new version gave:
>
> #! /share/apps/jtriley-ipython-e4e96f6

Hmmm, that's a strange installation issue. I'm using virtualenv and my
ipython/ipengine/ipcontroller/ipcluster scripts have a proper interpreter
(shebang) line, unlike the one above. How exactly did you install the
0.10.1-sge branch from my fork on github to get this line?
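
A quick way to spot this on any install is to read the first line of each
script and check that it names a real, executable interpreter. The Python
sketch below is only an illustration and is not something shipped with
IPython; the helper names shebang_target and shebang_ok are made up.

```python
import os


def shebang_target(script_path):
    """Return the interpreter path named on a script's first line, or None."""
    with open(script_path) as handle:
        first = handle.readline().strip()
    if not first.startswith("#!"):
        return None
    parts = first[2:].split()
    # The first token after "#!" is the interpreter; anything else is arguments.
    return parts[0] if parts else None


def shebang_ok(script_path):
    """True if the shebang names an existing, executable regular file."""
    target = shebang_target(script_path)
    return bool(target) and os.path.isfile(target) and os.access(target, os.X_OK)
```

A script whose first line is "#! /share/apps/jtriley-ipython-e4e96f6" would
fail this check whenever that path is not an executable file, which is
consistent with a bad interpreter error.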

> It did look as if an array job was started in SGE, but since the
> ipengines could not find ipcontroller it died.

So it seems the SGE launch script is working but you're still plagued by
the localhost-only issue in your engine furl files.

I just checked my engine furl files and found the same thing Brian
mentioned in a previous email: mine contain BOTH 127.0.0.1 and my public
IP. Checking the code, the kernel config defaults to using all
interfaces, and I haven't modified any ipython configs either. Have you
tried backing up your ~/.ipython dir and recreating it just to rule out
a config issue?
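
To compare furl files quickly, here is a small Python sketch that pulls the
location hints out of a furl and flags one that lists only loopback
addresses. It assumes the pb://TUBID@HOST:PORT[,HOST:PORT...]/NAME layout
seen in the furls quoted in this thread; the function names are
illustrative and not part of foolscap or IPython.

```python
def parse_furl(furl):
    """Split pb://TUBID@HINT[,HINT...]/NAME into (tub_id, hints, name)."""
    furl = furl.strip()
    if not furl.startswith("pb://"):
        raise ValueError("not a pb:// furl: %r" % furl)
    tub_id, sep, rest = furl[len("pb://"):].partition("@")
    if not sep:
        raise ValueError("furl has no '@' separator: %r" % furl)
    hint_part, _, name = rest.partition("/")
    # Hints are assumed to be comma-separated host:port pairs.
    hints = [h for h in hint_part.split(",") if h]
    return tub_id, hints, name


def is_loopback(hint):
    """True if a host:port hint points at the local machine only."""
    host = hint.rsplit(":", 1)[0]
    return host == "localhost" or host.startswith("127.")


def localhost_only(furl):
    """True if no hint in the furl is reachable from another node."""
    _, hints, _ = parse_furl(furl)
    return all(is_loopback(h) for h in hints)
```

For the furl Keith posted (a single 127.0.0.1:56104 hint), localhost_only
returns True; a furl that also lists 10.0.255.234:56104 returns False.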

The only other thing I can think of at the moment is to check your
version of foolscap:

$ python -c "import foolscap; print foolscap.__version__"

I'm using 0.5.1. Please try this version if you're using something else.

> Our cluster is going to be taken off the grid pretty soon.  If you make
> some more changes please try to get them to me as soon as you can.

Basically we need to get to the bottom of this localhost/furl file issue;
after that, things should work for you, assuming your ~/.ipython/security
folder is shared across nodes. I'll be on the #ipython channel today on
freenode IRC if you want to try to figure this out interactively.
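
One way to confirm the folder really is shared is to run a tiny script on
each node (for example from an SGE job) and compare the output. The sketch
below is only an illustration: it assumes the furl files live in
~/.ipython/security with a .furl extension, and furl_hints and report are
hypothetical helper names.

```python
import glob
import os
import socket


def furl_hints(text):
    """Return the host:port hints between '@' and the next '/' of a furl."""
    after_at = text.strip().partition("@")[2]
    return [h for h in after_at.partition("/")[0].split(",") if h]


def report(security_dir):
    """Return one line per *.furl file: hostname, file name, hints."""
    lines = []
    for path in sorted(glob.glob(os.path.join(security_dir, "*.furl"))):
        with open(path) as handle:
            hints = furl_hints(handle.read())
        lines.append("%s %s %s" % (socket.gethostname(),
                                    os.path.basename(path),
                                    ",".join(hints) or "(no hints)"))
    return lines


if __name__ == "__main__":
    found = report(os.path.expanduser("~/.ipython/security"))
    # An empty report means this node cannot see any furl files at all.
    for line in found or ["%s: no furl files visible" % socket.gethostname()]:
        print(line)
```

If every node prints the same file names and hints, the folder is shared;
a node that reports no furl files is not seeing the controller's
security directory.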

~Justin


On 08/11/2010 06:20 PM, Keith C Smith wrote:
> 
> 
> 
> Brian/Justin:
> 
> I installed the SGE-enabled IPython.
> 
> The first problem I ran into was that the ipython, ipengine,
> ipcontroller and ipcluster scripts had the wrong shell command.  I
> installed into a /share/apps directory so instead of:
> #! /share/apps/bin/python
> 
> the new version gave:
> 
> #! /share/apps/jtriley-ipython-e4e96f6
> 
> Which generated a bad interpreter error.
> 
> Once I fixed that I ran "ipcluster sge -n 40" a few times with
> different -g options, but ran into the same furl file problem I had
> before, where the furls only have localhost (127.0.0.1) as the
> ipcontroller location.
> 
> Since ipcluster does not pass "--engine-location=" or
> "--client-location" on to ipcontroller, I am stuck.
> 
> My original qsub ipcontroller mpiexec script file still works, as I set
> "--engine-location=" and "--client-location".
> 
> It did look as if an array job was started in SGE, but since the
> ipengines could not find ipcontroller it died.
> 
> Our cluster is going to be taken off the grid pretty soon.  If you make
> some more changes please try to get them to me as soon as you can.
> 
> Thanks, I really appreciate your efforts and IPython.
> 
> On 8/9/2010 12:38 PM, Brian Granger wrote:
>> Have you tried the native SGE support I emailed you about yet?
>>
>> Brian
>>
>> On Mon, Jul 26, 2010 at 4:09 PM, kcsmith <kcsmith@raytheon.com> wrote:
>>
>>> I'm trying to run ipcluster under the Sun Grid Engine on a 10-node cluster
>>> and I encountered the following error.
>>>
>>> Only those ipengines which reside on the same node as ipcontroller connect.
>>> The rest get CONNECTION REFUSED[111] errors.
>>>
>>> I traced this problem down to the furl files that ipcontroller creates.
>>> They only have the local host ip address listed.
>>> pb://d2vqoq6l7tmjtdjl4gi2ctwlwbxzzdc2@127.0.0.1:56104/ei4yhcb5qqa3pyyoi32j3guqfkzqtd5q
>>>
>>> If I manually add the actual ipcontroller node's IP address to the furl,
>>> then everything works: the ipengines connect and the client connects.
>>>
>>> i.e.
>>>
>>> pb://d2vqoq6l7tmjtdjl4gi2ctwlwbxzzdc2@10.0.255.234:56104/ei4yhcb5qqa3pyyoi32j3guqfkzqtd5q
>>>
>>> When ipcontroller is started on 10.0.255.234
>>>
>>> Is there some system setting or environment variable which can be set to
>>> force foolscap to include the ipcontroller node IP address?  Or is there
>>> something else wrong?
>>>
>>> Thanks,
>>> Keith
>>> --
>>> View this message in context:
>>> http://old.nabble.com/furls-only-have-localhost-127.0.0.1--tp29271660p29271660.html
>>> Sent from the IPython - User mailing list archive at Nabble.com.
>>>
>>
>>


