[IPython-User] Is iPython useful for this scenario?

MinRK benjaminrk@gmail....
Thu Feb 9 16:18:57 CST 2012


IPython is not well suited for remote process management.  In fact, that's
what IPython is worst at.  Where IPython is helpful is handing coordination
and communication among processes *once they have been started*.  If you
want a tool to help starting processes, better choices include celery,
salt, puppet, etc.

-MinRK

On Thu, Feb 9, 2012 at 13:59, Florian Lindner <mailinglists@xgm.de> wrote:

> 2012/2/7 Brian Granger <ellisonbg@gmail.com>:
> > Florian,
> >
> > On Tue, Feb 7, 2012 at 6:04 AM, Florian Lindner <mailinglists@xgm.de>
> wrote:
> >> 2012/2/6 Brian Granger <ellisonbg@gmail.com>:
> >>> On Sat, Feb 4, 2012 at 2:03 PM, Florian Lindner <mailinglists@xgm.de>
> wrote:
> >>
> >>>> I'm currently working on a control-/queue-management software for a
> >>>> CFD simulation system. If consists of three parts:
> >>>>
> >>>> - A client communicates with the CFD system. It can be long running.
> >>>> The client can also be run standalone.
> >>>>
> >>>> - A server which does the queue management and starts up the clients.
> >>>> It is non-interactive
> >>>>
> >>>> - A server interface which the user uses to talk to the server, e.g.
> >>>> to enqueue new jobs.
> >>>
> >>> I am following your design here, but the naming of things is a bit
> >>> backwards from IPython.  Here is our terminology:
> >>>
> >>> * Engine = runs on a compute node and does the actual computation.
> >>> This is where the CFD sim would run.
> >>> * Controller = Schedules tasks to engines using lightweight, low
> >>> latency scheduler.
> >>> * Cluster = Starts Engines/Controller using batch system.
> >>> * Client = Frontend process that the users uses to talk to the above.
> >>>
> >>>> Currently they are communicating via XMLRPC (from python stdlib):
> >>>>
> >>>> client <---- server <---- server interface.
> >>>
> >>> This architecture is partially reinventing everything in
> >>> IPython.parallel.  I would just use IPython.parallel and take
> >>> advantage of everything we have there.  It is extremely powerful and
> >>> will out perform XMLRPC by a long shot.
> >>>
> >>>> A this time the system works only localhost and with one client.
> >>>
> >>> IPython supports a wide range of cluster configurations (PBS, Torque,
> >>> mpiexec, SSH, etc.) and multiple engines and clients.
> >>>
> >>>> Before continue to extend it I wonder if iPython could be useful for
> >>>> network communication and process management. I browsed through the
> >>>> docs but I'm not entirely sure if I got the ideas of iPython right.
> >>>>
> >>>> The user should not get in contact with iPython. The software is not
> >>>> doing and probably will never do any numerical demanding calculations
> >>>> itself.
> >>>>
> >>>> Is iPython useful in the scenario?
> >>>
> >>> It would be extremely useful.  I would check out our cluster docs here:
> >>>
> >>> http://ipython.org/ipython-doc/dev/parallel/index.html
> >>>
> >>> The notebook would also be useful as well:
> >>>
> >>> http://ipython.org/ipython-doc/dev/interactive/htmlnotebook.html
> >>
> >> Ok, I'm still unsure about the ideas behind it all...
> >>
> >> Given I have a working SSH connection with passwordless authentication.
> >>
> >> Currently my engine is used like: flof.py case.conf. case.conf
> >> contains the steps for pre-processing, solving and post-processing.
> >> Data (= the case) is a directory structure and thus file based.
> >> flof.py would be the engine.
> >
> > You would have to do some refactoring and make the code in flof.py a
> > Python library that you can import and run.
> >
> >> - Code Distribution: Does iPython provides a method for code
> >> distribution or does the code needs to copied manually before starting
> >> it on a remote computing node? Is iPython starting the remote process?
> >
> > Yes for some cases it can do that, but in most cases you want to have
> > the code installed on the compute nodes.
> >
> >> - Data Distribution: Does iPython provides a method for data
> >> distribution (copy over the case to a remote node)? I think I read
> >> that there is no data distribution method.
> >
> > Yes, definitely.
>
> http://ipython.org/ipython-doc/stable/parallel/parallel_process.html says:
> "SSH mode does not do any file movement, so you will need to
> distribute configuration files manually." I need to copy over files (a
> directory)
>
> >> My server (controller) starts up the engines.
> >>
> >> - Can I tell the controller to start up a specific job on a specific
> machine?
> >
> > Yes, see the DirectView docs.
>
> I've seen it and also played around with it though found no way to
> launch an engine on a specific host. Didn't dive further into this.
>
> >> The matter is that I don't need queue management, load balancing etc.
> >> I'm looking for a tool that helps me controlling the remote jobs,
> >> checking and exchanging information about their status and - if needed
> >> - kill them. (a job is a process instance of flof.py)
> >
> > I strongly encourage you to read the docs I linked to above.  It will
> > answer all of these questions and more.
>
> I did read them and unfortunatly they are not.
>
> Well I will try to split up my questions in following threads.
>
> Regards,
>
> Florian
> _______________________________________________
> IPython-User mailing list
> IPython-User@scipy.org
> http://mail.scipy.org/mailman/listinfo/ipython-user
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.scipy.org/pipermail/ipython-user/attachments/20120209/36c3e06b/attachment.html 


More information about the IPython-User mailing list