[IPython-user] TaskClient reconnect

Vishal Vatsa vishal.vatsa@gmail....
Wed Dec 10 12:00:53 CST 2008


2008/12/10 Brian Granger <ellisonbg.net@gmail.com>:
> Vishal,
>
> Sorry I haven't gotten back to you.
>
> So far, our overall philosophy has been that if the controller dies,
> then everything is lost.  So, currently, once the controller dies:
>
> * the engines don't know how to reconnect to a new controller
> * the controller state is lost (the work queues)
> * the client will also be left in an unknown state.
>
> So, the fact that you are getting an error in the client is not
> surprising.  Some thoughts:
>
> * We do have plans to make the controller fault tolerant (by
> allowing restarts).  But this will take a good amount of work on the
> controller itself as well as the reconnection logic in the engines and
> clients.  I can give you details of this if you are interested, but
> this will be very subtle work.
>
> * The controller shouldn't just die - it should only stop when you
> kill it.  Can't you couple the controller killing logic with the
> corresponding logic in the client?

What I have ended up doing is using notifyOnDisconnect to handle a
disconnect, and then polling for a new connection every so often.

This way I can restart ipcluster without having to worry about any
of my systems that want to talk to ipcontroller.
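
For anyone following along, the pattern could be sketched roughly as
below. This is only an illustration, not the actual code: the class
ReconnectingClient and the connect, poll_interval and max_attempts
parameters are hypothetical names, and connect stands in for whatever
call builds a fresh TaskClient. The on_disconnect method is the sort of
callback one would hand to notifyOnDisconnect.

```python
import time


class ReconnectingClient:
    """Hypothetical wrapper that polls until a new connection succeeds.

    `connect` is any zero-argument callable that returns a client object
    or raises while the controller is unavailable (an assumption, not an
    IPython API).
    """

    def __init__(self, connect, poll_interval=5.0, sleep=time.sleep):
        self._connect = connect
        self._poll_interval = poll_interval
        self._sleep = sleep  # injectable so the loop can be tested quickly
        self.client = None

    def on_disconnect(self, *args):
        # Callback intended for notifyOnDisconnect: forget the dead client
        # so the next ensure_connected() call starts polling again.
        self.client = None

    def ensure_connected(self, max_attempts=None):
        # Retry connect() every poll_interval seconds until it succeeds,
        # or re-raise the last error once max_attempts failures occur.
        attempts = 0
        while self.client is None:
            try:
                self.client = self._connect()
            except Exception:
                attempts += 1
                if max_attempts is not None and attempts >= max_attempts:
                    raise
                self._sleep(self._poll_interval)
        return self.client
```

Callers would go through ensure_connected() before submitting work, so
a controller restart shows up only as a delay rather than as an error.
Note that any tasks queued on the old controller are still lost; this
only restores the connection.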

> We should track down the Stale Broker error though.  Could you post a
> bug report about this on the IPython launchpad bug tracker?

Will do.

Regards,
-vishal

