[IPython-User] troubleshooting ipcluster

Robert Nishihara robertnishihara@gmail....
Wed Jul 11 14:09:57 CDT 2012


There was no stdout, and the stderr (attached and copied below) looks
normal.

I started the controller and engines separately this time (with qsub),
using the two attached scripts. This procedure worked fine before the
upgrade.

I also tried recreating the sge profile, following the instructions in this
thread:
<http://python.6.n6.nabble.com/Getting-setup-on-a-remote-cluster-w-Sun-Grid-Engine-td1663090.html>
but this had no effect.

stderr for controller

    2012-07-11 14:59:49,674.674 [IPControllerApp] Using existing profile dir: u'/home/robert/.ipython/profile_sge'
    2012-07-11 14:59:49.797 [IPControllerApp] Hub listening on tcp://0.0.0.0:52512 for registration.
    2012-07-11 14:59:49.799 [IPControllerApp] Hub using DB backend: 'NoDB'
    2012-07-11 14:59:50.079 [IPControllerApp] hub::created hub
    2012-07-11 14:59:50.085 [IPControllerApp] writing connection info to /home/robert/.ipython/profile_sge/security/ipcontroller-client.json
    2012-07-11 14:59:50.103 [IPControllerApp] writing connection info to /home/robert/.ipython/profile_sge/security/ipcontroller-engine.json
    2012-07-11 14:59:50.129 [IPControllerApp] task::using Python leastload Task scheduler
    2012-07-11 14:59:50.135 [IPControllerApp] Heartmonitor started
    2012-07-11 14:59:50.166 [scheduler] Scheduler started [leastload]
    2012-07-11 14:59:50.195 [IPControllerApp] Creating pid file: /home/robert/.ipython/profile_sge/pid/ipcontroller.pid
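
The controller log above reports the address on which the hub listens for engine registration. One thing that could be checked from an engine node is whether that port is reachable at all. The following is only a minimal sketch under stated assumptions: the function names `parse_hub_address` and `port_is_open` are made up for illustration and are not part of IPython, and the hostname passed to `port_is_open` has to be supplied by hand, since `0.0.0.0` in the log is a bind address rather than a destination.

```python
import re
import socket

# Matches the controller log line, for example:
#   "Hub listening on tcp://0.0.0.0:52512 for registration."
# The \s* tolerates a line break after "tcp://" in mail-wrapped logs.
HUB_LINE = re.compile(r"Hub listening on tcp://\s*([^\s:]+):(\d+)")


def parse_hub_address(log_text):
    """Return (bind_address, port) found in controller stderr, or None."""
    match = HUB_LINE.search(log_text)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

For example, `port_is_open("controller-node", 52512)` could be run on an engine node, where `controller-node` is a placeholder for whichever host the controller job landed on. A reachable port would not prove that registration works, but an unreachable one would point at networking rather than at IPython itself.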

stderr for engines

    2012-07-11 14:59:58,438.438 [IPClusterEngines] Using existing profile dir: u'/home/robert/.ipython/profile_sge'
    2012-07-11 14:59:58.470 [IPClusterEngines] IPython cluster: started
    2012-07-11 14:59:58.471 [IPClusterEngines] Starting engines with [daemon=False]
    2012-07-11 14:59:58.471 [IPClusterEngines] Starting 5 Engines with SGEEngineSetLauncher
    2012-07-11 14:59:58.559 [IPClusterEngines] Job submitted with job id: '2092'
    2012-07-11 15:00:28.559 [IPClusterEngines] Engines appear to have started successfully
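
Since the launcher log only shows that the job was submitted, the scheduler's own records may say more about why the job exited. As a rough sketch (the helper names `parse_job_ids` and `sge_commands` are hypothetical, not part of IPython or SGE), the job ids can be pulled out of the log and turned into `qstat -j` and `qacct -j` invocations, which are the standard SGE commands for inspecting a pending or running job and a finished job respectively:

```python
import re

# Matches launcher log lines such as:
#   "Job submitted with job id: '2092'"
# The \s* tolerates a line break before the quoted id in mail-wrapped logs.
JOB_ID = re.compile(r"Job submitted with job id:\s*'(\d+)'")


def parse_job_ids(log_text):
    """Return every SGE job id mentioned in the launcher log, in order."""
    return JOB_ID.findall(log_text)


def sge_commands(job_id):
    """Build argument lists for inspecting one SGE job.

    qstat -j shows details while the job is still queued or running;
    qacct -j shows accounting data (including exit status) once it has ended.
    """
    return [["qstat", "-j", job_id], ["qacct", "-j", job_id]]
```

Each list returned by `sge_commands` can be passed to `subprocess.call` on a submit host, or simply typed at the shell. Whether `qacct` has data depends on accounting being enabled on the cluster, which is an assumption here.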

-Robert

On Wed, Jul 11, 2012 at 10:42 AM, MinRK <benjaminrk@gmail.com> wrote:

> What is the stdout/err of the controller and engine jobs?
>
> On Wed, Jul 11, 2012 at 6:03 PM, Robert Nishihara
> <robertnishihara@gmail.com> wrote:
> > My cluster recently upgraded to IPython 0.13. Now, when I run
> >
> >     ipcluster start -n 3 --profile=sge
> >
> > the controller and engines get submitted to the queue, but they terminate
> > immediately after starting. However, the output looks normal:
> >
> >     2012-07-11 11:56:27,531.531 [IPClusterStart] Using existing profile dir: u'/home/robert/.ipython/profile_sge'
> >     2012-07-11 11:56:27.566 [IPClusterStart] Starting ipcluster with [daemon=False]
> >     2012-07-11 11:56:27.570 [IPClusterStart] Creating pid file: /home/robert/.ipython/profile_sge/pid/ipcluster.pid
> >     2012-07-11 11:56:27.573 [IPClusterStart] Starting Controller with SGEControllerLauncher
> >     2012-07-11 11:56:27.723 [IPClusterStart] Job submitted with job id: '2088'
> >     2012-07-11 11:56:28.568 [IPClusterStart] Starting 3 Engines with SGEEngineSetLauncher
> >     2012-07-11 11:56:28.645 [IPClusterStart] Job submitted with job id: '2089'
> >     2012-07-11 11:56:58.647 [IPClusterStart] Engines appear to have started successfully
> >
> > Is there a good way to troubleshoot this? The --debug flag doesn't seem to
> > give me any useful information.
> >
> > -Robert
> >
> > _______________________________________________
> > IPython-User mailing list
> > IPython-User@scipy.org
> > http://mail.scipy.org/mailman/listinfo/ipython-user
> >
-------------- next part --------------
A non-text attachment was scrubbed...
Name: start_controller.sh
Type: application/x-sh
Size: 204 bytes
Desc: not available
Url : http://mail.scipy.org/pipermail/ipython-user/attachments/20120711/b0ac15ff/attachment.sh 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: start_engines.sh
Type: application/x-sh
Size: 208 bytes
Desc: not available
Url : http://mail.scipy.org/pipermail/ipython-user/attachments/20120711/b0ac15ff/attachment-0001.sh 

