[IPython-User] Run code on engine before any parallel task

Andreas Schröder andreas@drqueue....
Tue Feb 21 16:06:16 CST 2012


Hello IPython list,

While working on DrQueueIPython, I ran into the problem that I need to
group engines together in 'pools' so that tasks for a given pool will
only run on its members.

The purpose is to have one big cluster with lots of machines serving
different users, each of whom needs their tasks to run isolated from
those of other users.

My current approach is to store (additional) information about
registered engines, including pool membership, in MongoDB and to check
this information via a dependent function before running a task.
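
To make the idea concrete, here is a minimal sketch of such a membership
check. It is only an illustration: the dictionary stands in for the
MongoDB collection, and the names ENGINE_POOLS, engine_in_pool and
eligible_engines are made up for this example, not taken from
DrQueueIPython.

```python
# Hypothetical stand-in for the MongoDB collection: engine_id -> pool name.
ENGINE_POOLS = {0: "pool_a", 1: "pool_a", 2: "pool_b"}


def engine_in_pool(engine_id, pool_name, engine_pools=ENGINE_POOLS):
    """Return True if the engine is recorded as a member of pool_name."""
    # Unknown engines are treated as belonging to no pool.
    return engine_pools.get(engine_id) == pool_name


def eligible_engines(pool_name, engine_pools=ENGINE_POOLS):
    """Return the sorted ids of all engines recorded in pool_name."""
    return sorted(eid for eid, pool in engine_pools.items() if pool == pool_name)
```

A predicate like engine_in_pool could be attached to a task with the
depend decorator from IPython.parallel, so that the task is not run on
an engine for which the check returns False. How the engine learns its
own id and reaches the membership data is left open here.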

I have a wrapper script for starting 'ipengine' (see
https://github.com/kaazoo/DrQueueIPython/blob/master/bin/drqueue_slave.py).
The script takes the engine_id from the ipengine output and tries to run
code using DirectView on that engine_id directly after starting the process.
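
The extraction step can be sketched roughly as follows. This is not the
code from drqueue_slave.py; the function name find_engine_id and the
regular expression are assumptions, and the pattern has to be adapted to
whatever ipengine really prints on registration.

```python
import re

# Assumed shape of the registration message; adjust to the real ipengine output.
REGISTRATION_RE = re.compile(r"registration with id (\d+)")


def find_engine_id(lines, pattern=REGISTRATION_RE):
    """Return the first engine id found in the given output lines, or None."""
    for line in lines:
        match = pattern.search(line)
        if match:
            return int(match.group(1))
    return None
```

The wrapper would feed the lines of the engine's output into this
function and only continue with the DirectView call once an id was
found.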

It works more or less but has several disadvantages:
 * There is some delay between registration of the engine and storage
   of the logfile.
 * There seems to be some delay between the registration as seen by the
   engine and the registration as seen by ipcontroller.
 * It's not possible to make sure that the pool setup code is the first
   code that runs on the engine.
 * The computer running the engine process needs to have access to
   MongoDB, which possibly runs on another machine. That is not good
   for security.


Do you have any idea how to monitor registration of new engines and
directly run code on them before other tasks are run?
Or do you even have a better suggestion?


Regards,
Andreas

-- 

Andreas Schröder | developer

DrQueue, the Open Source Distributed Render Queue
http://www.drqueue.org

