[IPython-User] [IPython-user] furls only have localhost as ipcontroller location

kcsmith kcsmith@raytheon....
Tue Jul 27 12:06:28 CDT 2010


I got it to work by NOT using ipcluster.

For those who care, here's the script I submit via qsub to the Sun Grid
Engine on an 80-core, 10-compute-node Linux Rocks cluster:
-----------------------------------------------------------
#!/bin/bash
#$ -cwd
#$ -pe Common 40
#$ -j y
#$ -S /bin/bash
echo "Starting..."
echo $HOSTNAME
echo $NSLOTS
# Note:  Some of the following may not be needed
export TMP=/tmp
export TMPDIR=/tmp
export MPI_DIR=/opt/openmpi/
PATH=$PATH:/opt/openmpi/bin:/share/apps/bin:/share/apps/lib
export PATH
# Note: Sun Grid Engine will pick a compute node to run this on
# (i.e. NOT the head node under Rocks)
ipcontroller -r --client-location=$HOSTNAME --engine-location=$HOSTNAME \
    --client-port=10100 --engine-port=10101 -l=ipcontroller.log &
sleep 3
echo "starting ipengines..."
mpiexec -n $NSLOTS ipengine --mpi=mpi4py
wait
-------------------------------------------------------------

I was misled by the ipcluster documentation, which appears to imply that
ipcluster mpiexec -n $NSLOTS --mpi=mpi4py
would work when the ipengines and the client run on different servers.

If you see the following error message:

Failure: twisted.internet.error.ConnectionRefusedError: Connection was
refused by other side: 111: Connection refused.

Check your furl files: they may list only the localhost address
(127.0.0.1) as the ipcontroller location.
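
A minimal sketch of that check. The directory ~/.ipython/security is an
assumption (the location depends on the IPython version and on how the
controller was started), and FURL_DIR and furl_location are illustrative
names, not part of IPython:

```shell
# Print the <host>:<port> part of a furl of the form
#   pb://<id>@<host>:<port>/<name>
furl_location() {
    # Drop everything up to and including '@', then everything from the next '/'.
    printf '%s\n' "$1" | sed -e 's|^pb://[^@]*@||' -e 's|/.*$||'
}

# Assumed location of the furl files; override FURL_DIR if yours differs.
FURL_DIR="${FURL_DIR:-$HOME/.ipython/security}"

for f in "$FURL_DIR"/*.furl; do
    [ -e "$f" ] || continue            # skip if no furl files were found
    loc=$(furl_location "$(cat "$f")")
    case "$loc" in
        127.0.0.1:*|localhost:*)
            echo "$f: $loc  <-- localhost only" ;;
        *)
            echo "$f: $loc" ;;
    esac
done
```

Any file flagged as localhost only is a likely cause of the connection
refusals on the other nodes.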




kcsmith wrote:
> 
> I'm trying to run ipcluster under the sun grid engine on a 10 node cluster
> and I encountered the following error.
> 
> Only those ipengines which reside on the same node as ipcontroller
> connect.   The rest get CONNECTION REFUSED[111] errors.
> 
> I traced this problem down to the furl files that ipcontroller creates.
> They only have the localhost IP address listed.
> pb://d2vqoq6l7tmjtdjl4gi2ctwlwbxzzdc2@127.0.0.1:56104/ei4yhcb5qqa3pyyoi32j3guqfkzqtd5q
> 
> If I manually add the actual ipcontroller node's ip address to the furl
> then everything works, ipengines connect and the client connects.
> 
> i.e.
> 
> pb://d2vqoq6l7tmjtdjl4gi2ctwlwbxzzdc2@10.0.255.234:56104/ei4yhcb5qqa3pyyoi32j3guqfkzqtd5q
> 
> When ipcontroller is started on 10.0.255.234
> 
> Is there some system setting or environment variable which can be set to
> force foolscap to include the ipcontroller node ip address?  Or is there
> something else wrong??
> 
> Thanks,
> Keith
> 
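
The manual edit described in the quoted message can be sketched as a small
helper. This is only an illustration: rewrite_furl is a hypothetical name,
and it assumes the furl carries a single 127.0.0.1 location, as in the
example quoted above.

```shell
# Replace a 127.0.0.1 location in a furl with the controller node's address.
# Usage: rewrite_furl <furl> <controller-ip>
# The address is pasted into a sed expression, so it must not contain '/'.
rewrite_furl() {
    printf '%s\n' "$1" | sed "s/@127\.0\.0\.1:/@$2:/"
}

# Example with the furl from the quoted message:
rewrite_furl \
    'pb://d2vqoq6l7tmjtdjl4gi2ctwlwbxzzdc2@127.0.0.1:56104/ei4yhcb5qqa3pyyoi32j3guqfkzqtd5q' \
    10.0.255.234
# prints pb://d2vqoq6l7tmjtdjl4gi2ctwlwbxzzdc2@10.0.255.234:56104/ei4yhcb5qqa3pyyoi32j3guqfkzqtd5q
```

Passing --client-location and --engine-location to ipcontroller, as in the
job script at the top of this message, avoids the need to edit the furl
files by hand.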



