[IPython-dev] parallel job arrays
Tue Feb 4 04:12:19 CST 2014
I'm just starting to use IPython parallel with our SGE cluster. It's a
wonderful piece of work, thank you.
I'm using a load balanced view over a cluster to do asynchronous map
operations. One map operation may be over a large number of tasks, and may
take a while (hours to days). I'd like to be able to set up and launch the job
(from an IPython notebook) then check later if everything has finished and
retrieve the results (possibly from a different computer). I've configured
the controller to use the SQLite DB and have figured out how to retrieve
results from the hub given a list of msg ids. All good so far.
If I understand correctly, in order to fetch the results later from a
different client session, I need to keep a record of all the msg ids from
the map job I launched earlier. I guess the simplest thing to do is to
write the msg ids to a file, then read them back later? Is this what other
folks do, or am I missing something?
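
A minimal sketch of the file-based approach described above, assuming the
msg ids are plain strings that can be stored as JSON. The helper names
(save_msg_ids, load_msg_ids, fetch_later) and the file name are hypothetical,
not part of IPython; the import path is the IPython 1.x one and is done
lazily, so the file helpers work without a cluster.

```python
import json


def save_msg_ids(path, msg_ids):
    """Write the msg ids of one map job to a JSON file."""
    with open(path, "w") as f:
        json.dump(list(msg_ids), f)


def load_msg_ids(path):
    """Read back the msg ids written by save_msg_ids."""
    with open(path) as f:
        return json.load(f)


def fetch_later(path):
    """Ask the hub for the results of a previously launched map job.

    Assumes a running controller that records results (e.g. the SQLite DB).
    """
    # Imported here so the helpers above do not require IPython.
    from IPython.parallel import Client  # IPython 1.x import path

    rc = Client()
    return rc.get_result(load_msg_ids(path))
```

In the launching session this would be used roughly as
save_msg_ids("mapjob.json", amr.msg_ids), where amr is the result of
view.map_async(...); a later session would then call
fetch_later("mapjob.json").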
I'm used to using job arrays in SGE, and when you launch a job array you
get a single job ID (e.g., 1234567) representing the whole job array, as
well as job IDs for each of the tasks (e.g., 1234567.1, 1234567.2 etc.).
It's just a thought, but if IPython parallel had some sort of similar
notion (i.e., when you submit a parallel map, you get a msg id for the
whole map operation as well as msg ids for each of the individual tasks),
then that might support some small conveniences. For example, you could
call client.get_result() with just the msg id for the whole map operation
and get back something like an AsyncMapResult. You could also call
client.abort() with the same msg id to cancel the whole map.
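
Until something like that exists, the job-array idea can be approximated on
the client side. The sketch below is only an illustration: register_map,
task_ids and the JSON registry file are hypothetical names, not IPython API.
It stores each map's msg ids under a single generated group id, so one id
can stand in for the whole map.

```python
import json
import os
import uuid


def register_map(registry_path, msg_ids):
    """Store msg_ids under a new group id and return that id."""
    registry = {}
    if os.path.exists(registry_path):
        with open(registry_path) as f:
            registry = json.load(f)
    group_id = uuid.uuid4().hex  # plays the role of the SGE array job ID
    registry[group_id] = list(msg_ids)
    with open(registry_path, "w") as f:
        json.dump(registry, f, indent=2)
    return group_id


def task_ids(registry_path, group_id):
    """Return the per-task msg ids recorded for one group id."""
    with open(registry_path) as f:
        return json.load(f)[group_id]
```

With a Client instance rc, the whole map could then be fetched with
rc.get_result(task_ids(path, gid)) or cancelled with
rc.abort(jobs=task_ids(path, gid)).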
By the way, when I passed a list of msg ids to client.get_result() and then
called wait_interactive() on the returned AsyncHubResult, it showed '8/8
tasks finished' right from the start, even though tasks were still
pending/running. I'm on IPython 1.1.0.
Thanks again for the great work,
Head of Epidemiological Informatics
Centre for Genomics and Global Health <http://cggh.org>
The Wellcome Trust Centre for Human Genetics
Tel: +44 (0)1865 287721 ***new number***