[IPython-user] TaskClient reconnect
Wed Dec 10 12:00:53 CST 2008
2008/12/10 Brian Granger <firstname.lastname@example.org>:
> Sorry I haven't gotten back to you.
> So far, our overall philosophy has been that if the controller dies,
> then everything is lost. So, currently, once the controller dies:
> * the engines don't know how to reconnect to a new controller
> * the controller state is lost (the work queues)
> * the client will also be left in an unknown state.
> So, the fact that you are getting an error in the client is not
> surprising. Some thoughts:
> * We do have plans to make the controller fault tolerant (by
> allowing restarts). But this will take a good amount of work on the
> controller itself as well as the reconnection logic in the engines and
> clients. I can give you details of this if you are interested, but
> this will be very subtle work.
> * The controller shouldn't just die - it should only stop when you
> kill it. Can't you couple the controller killing logic with the
> corresponding logic in the client?
What I have ended up doing is using notifyOnDisconnect to handle a
disconnect, then polling for a new connection every so often.
This way I can restart ipcluster and not have to worry about any
of my systems that want to talk to ipcontroller.
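
Roughly, the pattern looks like the sketch below. It is only an outline,
not the actual code: ReconnectingClient, connect, poll_interval and
ensure_connected are hypothetical names made up for illustration, and
connect stands in for whatever builds a fresh client and raises if the
controller is not up yet. The on_disconnect method is the sort of
callback one would hand to notifyOnDisconnect.

```python
import time


class ReconnectingClient:
    """Hypothetical wrapper that rebuilds a client after a disconnect."""

    def __init__(self, connect, poll_interval=5.0, sleep=time.sleep):
        # connect: hypothetical factory returning a new client object;
        # it is assumed to raise an exception while the controller is down.
        self._connect = connect
        self._poll_interval = poll_interval
        self._sleep = sleep
        self.client = None

    def on_disconnect(self):
        # Intended to be registered as the disconnect callback:
        # drop the dead client so the next call reconnects.
        self.client = None

    def ensure_connected(self, max_attempts=None):
        # Poll until connect() succeeds, or give up after max_attempts
        # failures by re-raising the last error.
        attempts = 0
        while self.client is None:
            try:
                self.client = self._connect()
            except Exception:
                attempts += 1
                if max_attempts is not None and attempts >= max_attempts:
                    raise
                self._sleep(self._poll_interval)
        return self.client
```

Callers would go through ensure_connected() before each use instead of
holding on to a client object, so a restarted controller is picked up on
the next poll.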
> We should track down the Stale Broker error though. Could you post a
> bug report about this on the IPython launchpad bug tracker?