[IPython-User] Is iPython useful for this scenario?

Florian Lindner mailinglists@xgm...
Thu Feb 9 15:59:20 CST 2012


2012/2/7 Brian Granger <ellisonbg@gmail.com>:
> Florian,
>
> On Tue, Feb 7, 2012 at 6:04 AM, Florian Lindner <mailinglists@xgm.de> wrote:
>> 2012/2/6 Brian Granger <ellisonbg@gmail.com>:
>>> On Sat, Feb 4, 2012 at 2:03 PM, Florian Lindner <mailinglists@xgm.de> wrote:
>>
>>>> I'm currently working on a control-/queue-management software for a
>>>> CFD simulation system. It consists of three parts:
>>>>
>>>> - A client communicates with the CFD system. It can be long running.
>>>> The client can also be run standalone.
>>>>
>>>> - A server which does the queue management and starts up the clients.
>>>> It is non-interactive.
>>>>
>>>> - A server interface which the user uses to talk to the server, e.g.
>>>> to enqueue new jobs.
>>>
>>> I am following your design here, but the naming of things is a bit
>>> backwards from IPython.  Here is our terminology:
>>>
>>> * Engine = runs on a compute node and does the actual computation.
>>> This is where the CFD sim would run.
>>> * Controller = Schedules tasks to engines using lightweight, low
>>> latency scheduler.
>>> * Cluster = Starts Engines/Controller using batch system.
>>> * Client = Frontend process that the user uses to talk to the above.
>>>
>>>> Currently they are communicating via XMLRPC (from python stdlib):
>>>>
>>>> client <---- server <---- server interface.
>>>
>>> This architecture is partially reinventing everything in
>>> IPython.parallel.  I would just use IPython.parallel and take
>>> advantage of everything we have there.  It is extremely powerful and
>>> will outperform XMLRPC by a long shot.
>>>
>>>> At this time the system works only on localhost and with one client.
>>>
>>> IPython supports a wide range of cluster configurations (PBS, Torque,
>>> mpiexec, SSH, etc.) and multiple engines and clients.
>>>
>>>> Before I continue to extend it, I wonder if iPython could be useful
>>>> for network communication and process management. I browsed through
>>>> the docs but I'm not entirely sure if I got the ideas of iPython right.
>>>>
>>>> The user should not come into contact with iPython. The software is
>>>> not doing, and probably never will do, any numerically demanding
>>>> calculations itself.
>>>>
>>>> Is iPython useful in this scenario?
>>>
>>> It would be extremely useful.  I would check out our cluster docs here:
>>>
>>> http://ipython.org/ipython-doc/dev/parallel/index.html
>>>
>>> The notebook would be useful as well:
>>>
>>> http://ipython.org/ipython-doc/dev/interactive/htmlnotebook.html
>>
>> Ok, I'm still unsure about the ideas behind it all...
>>
>> Assume I have a working SSH connection with passwordless authentication.
>>
>> Currently my engine is used like: flof.py case.conf. case.conf
>> contains the steps for pre-processing, solving and post-processing.
>> Data (= the case) is a directory structure and thus file based.
>> flof.py would be the engine.
>
> You would have to do some refactoring and make the code in flof.py a
> Python library that you can import and run.
>
>> - Code Distribution: Does iPython provide a method for code
>> distribution or does the code need to be copied manually before
>> starting it on a remote computing node? Is iPython starting the
>> remote process?
>
> Yes, for some cases it can do that, but in most cases you want to have
> the code installed on the compute nodes.
>
>> - Data Distribution: Does iPython provide a method for data
>> distribution (copy over the case to a remote node)? I think I read
>> that there is no data distribution method.
>
> Yes, definitely.

http://ipython.org/ipython-doc/stable/parallel/parallel_process.html says:
"SSH mode does not do any file movement, so you will need to
distribute configuration files manually." I need to copy over files (a
directory).
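
Since SSH mode does no file movement, copying the case directory would
have to happen outside IPython. A minimal sketch of that step, assuming
rsync is installed on both ends and the passwordless SSH login mentioned
above works. The host name "node01", the remote path "/scratch/flof" and
the function names are hypothetical, chosen only for illustration:

```python
import shlex
import subprocess


def build_copy_command(case_dir, host, remote_parent):
    """Build an rsync command that copies case_dir into remote_parent on host.

    The trailing slash is stripped from the source so that rsync copies the
    directory itself (not only its contents) into the remote parent directory.
    """
    src = case_dir.rstrip("/")
    dest = "%s:%s/" % (host, remote_parent.rstrip("/"))
    # -a keeps the directory tree, permissions and timestamps; -z compresses.
    return ["rsync", "-az", "-e", "ssh", src, dest]


def copy_case(case_dir, host, remote_parent, dry_run=True):
    """Print the command (dry_run=True) or actually run it over SSH."""
    cmd = build_copy_command(case_dir, host, remote_parent)
    if dry_run:
        print(" ".join(shlex.quote(part) for part in cmd))
    else:
        subprocess.check_call(cmd)  # raises CalledProcessError on failure
    return cmd


if __name__ == "__main__":
    # Hypothetical case directory and node; dry run only prints the command.
    copy_case("cases/run1/", "node01", "/scratch/flof")
```

With dry_run=False the same call performs the copy, after which flof.py
could be pointed at the remote copy of the case.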

>> My server (controller) starts up the engines.
>>
>> - Can I tell the controller to start up a specific job on a specific machine?
>
> Yes, see the DirectView docs.

I've seen it and also played around with it, but found no way to
launch an engine on a specific host. I didn't dive further into this.
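
For the related question of running a job on a specific machine, one
possible approach is to ask every running engine for its host name and
then take a DirectView on an engine that reports the wanted host. This
is only a sketch: it does not launch engines, it assumes a controller
and engines are already running, and it uses the ipyparallel package
(with the 2012 release the import would be
"from IPython.parallel import Client"). The helper names are
hypothetical.

```python
import socket


def engines_on_host(hostnames, host):
    """Return the sorted engine ids whose reported host name equals host.

    hostnames is a dict mapping engine id -> host name.
    """
    return sorted(eid for eid, name in hostnames.items() if name == host)


def run_on_host(host, func, *args):
    """Run func(*args) on one engine located on host and return the result.

    Needs a running cluster; the import is local so that the pure helper
    above can be used without ipyparallel installed.
    """
    import ipyparallel as ipp

    rc = ipp.Client()
    # Ask all engines for their host name; get_dict() keys by engine id.
    hostnames = rc[:].apply_async(socket.gethostname).get_dict()
    ids = engines_on_host(hostnames, host)
    if not ids:
        raise LookupError("no engine is running on %r" % host)
    # Indexing the client with a single id gives a DirectView on that engine.
    return rc[ids[0]].apply_sync(func, *args)
```

For example, run_on_host("node01", os.getpid) would return the process
id of an engine on the (hypothetical) host "node01".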

>> The matter is that I don't need queue management, load balancing etc.
>> I'm looking for a tool that helps me control the remote jobs,
>> checking and exchanging information about their status and - if needed
>> - kill them. (a job is a process instance of flof.py)
>
> I strongly encourage you to read the docs I linked to above.  It will
> answer all of these questions and more.

I did read them and unfortunately they do not.
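
To make the requirement concrete: the job control I have in mind
(start a process, check its status, kill it if needed) looks roughly
like the following on a single machine, using only the Python standard
library. This is a local sketch, not IPython's API; the function names
are hypothetical, and for a remote node the command would have to be
wrapped, e.g. in an ssh call.

```python
import subprocess
import sys


def start_job(argv):
    """Start a job as a child process and return its Popen handle."""
    return subprocess.Popen(argv)


def job_status(proc):
    """Return 'running', 'finished' (exit code 0) or 'failed'."""
    code = proc.poll()  # None while the process is still alive
    if code is None:
        return "running"
    return "finished" if code == 0 else "failed"


def kill_job(proc, timeout=5.0):
    """Ask the job to terminate; force-kill it if it does not exit in time."""
    if proc.poll() is None:
        proc.terminate()
        try:
            proc.wait(timeout=timeout)
        except subprocess.TimeoutExpired:
            proc.kill()
            proc.wait()
    return proc.returncode
```

A job would be started with something like
start_job([sys.executable, "flof.py", "case.conf"]).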

Well, I will try to split up my questions into separate follow-up threads.

Regards,

Florian

