[IPython-User] furls only have localhost as ipcontroller location
kcsmith
kcsmith@raytheon....
Tue Jul 27 12:06:28 CDT 2010
I got it to work by NOT using ipcluster.
For those who care, here's the script I submit via qsub to the Sun Grid
Engine on an 80-core, 10-compute-node Linux Rocks cluster:
-----------------------------------------------------------
#!/bin/bash
#$ -cwd
#$ -pe Common 40
#$ -j y
#$ -S /bin/bash
echo "Starting..."
echo $HOSTNAME
echo $NSLOTS
# Note: Some of the following may not be needed
export TMP=/tmp
export TMPDIR=/tmp
export MPI_DIR=/opt/openmpi/
PATH=$PATH:/opt/openmpi/bin:/share/apps/bin:/share/apps/lib
export PATH
# Note: Sun Grid Engine will pick a compute node to run this on (i.e. NOT
# the head node under Rocks)
ipcontroller -r --client-location=$HOSTNAME --engine-location=$HOSTNAME \
    --client-port=10100 --engine-port=10101 -l=ipcontroller.log &
sleep 3
echo "starting ipengines..."
mpiexec -n $NSLOTS ipengine --mpi=mpi4py
wait
-------------------------------------------------------------
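One fragile spot in the script above is the fixed "sleep 3": if ipcontroller takes longer than that to write its furl files, the engines start too early. A minimal sketch of an alternative is to poll for the engine furl file. Note that wait_for_file is a hypothetical helper (not part of IPython), and the furl path shown in the comments is an assumption about the default location; adjust it for your installation.

```shell
#!/bin/bash
# Sketch only: wait until a file exists and is non-empty, or give up
# after a timeout. wait_for_file is a hypothetical helper name.
wait_for_file() {
    local path="$1"
    local timeout="${2:-30}"   # seconds to wait before giving up
    local waited=0
    while [ ! -s "$path" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "timed out waiting for $path" >&2
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Possible use in place of "sleep 3" (the path is an assumption):
# FURL="$HOME/.ipython/security/ipcontroller-engine.furl"
# wait_for_file "$FURL" 30 || exit 1
```

One caveat: a furl file left over from an earlier run would satisfy the check immediately, so this only helps if stale files are removed first or are known to be valid.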
I was misled by the ipcluster documentation, which appears to imply that
ipcluster mpiexec -n $NSLOTS --mpi=mpi4py
would work when the ipengines and the client run on different servers.
If you see the following error message:
Failure: twisted.internet.error.ConnectionRefusedError: Connection was
refused by other side: 111: Connection refused.
check your furl files: they may list only 127.0.0.1 as the controller's address.
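As a rough way to do that check from a script, the sketch below pulls the host part out of a furl and reports whether every address in it is a loopback address. The helper names furl_hosts and furl_is_local_only are hypothetical, and the code assumes the furl has the form pb://TUBID@HOST:PORT/NAME, possibly with several comma-separated HOST:PORT entries.

```shell
#!/bin/bash
# Sketch only: both helpers are hypothetical names, not part of IPython
# or foolscap. Assumed furl form: pb://TUBID@HOST:PORT[,HOST:PORT...]/NAME

# Print the host of each HOST:PORT entry, one per line.
furl_hosts() {
    local furl="$1"
    local hints="${furl#*@}"    # drop "pb://TUBID@"
    hints="${hints%%/*}"        # drop "/NAME"
    local hint
    local IFS=','
    for hint in $hints; do
        echo "${hint%%:*}"      # drop ":PORT"
    done
}

# Succeed only if the furl has at least one host and all are loopback.
furl_is_local_only() {
    local host seen=0
    for host in $(furl_hosts "$1"); do
        seen=1
        case "$host" in
            127.*|localhost) ;;
            *) return 1 ;;
        esac
    done
    [ "$seen" -eq 1 ]
}
```

For example, furl_is_local_only "$(cat ipcontroller-engine.furl)" && echo "furl only lists localhost" would flag the situation described in the quoted message below; the file name here is an assumption about where your furl lives.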
kcsmith wrote:
>
> I'm trying to run ipcluster under the sun grid engine on a 10 node cluster
> and I encountered the following error.
>
> Only those ipengines which reside on the same node as ipcontroller
> connect. The rest get CONNECTION REFUSED[111] errors.
>
> I traced this problem down to the furl files that ipcontroller creates.
> They only have the local host ip address listed.
> pb://d2vqoq6l7tmjtdjl4gi2ctwlwbxzzdc2@127.0.0.1:56104/ei4yhcb5qqa3pyyoi32j3guqfkzqtd5q
>
> If I manually add the actual ipcontroller node's ip address to the furl
> then everything works, ipengines connect and the client connects.
>
> i.e.
>
> pb://d2vqoq6l7tmjtdjl4gi2ctwlwbxzzdc2@10.0.255.234:56104/ei4yhcb5qqa3pyyoi32j3guqfkzqtd5q
>
> When ipcontroller is started on 10.0.255.234
>
> Is there some system setting or environment variable which can be set to
> force foolscap to include the ipcontroller node ip address? Or is there
> something else wrong??
>
> Thanks,
> Keith
>
--
View this message in context: http://old.nabble.com/furls-only-have-localhost-as-ipcontroller-location-tp29271660p29278568.html
Sent from the IPython - User mailing list archive at Nabble.com.