[IPython-User] ipython parallel task dependencies

Peter Prettenhofer peter.prettenhofer@gmail....
Wed Sep 25 14:20:08 CDT 2013

Hi Min,

2013/9/25 MinRK <benjaminrk@gmail.com>

> [..]
> I think these probably can be expressed with the dependencies, but I need
> more information about the relationship between tasks. Specifically, what
> information needs to be conveyed from one task to the other, and what
> restriction is applied to the dependent task - must it run in the same
> location, or should it run anywhere, as long as it is after the earlier
> task completes?
> 1. what information do extract-transform tasks need from the fetch result?

The fetch task writes a file to a (distributed) file system;
extract-transform gets as input the file name (netcdf file) and an
identifier of the chunk within the file to process (there are k chunks per

> 2. how do the extract-transform tasks relate to their parent fetch tasks?
> Do they want to run on the same engine, or can they run anywhere, as long
> as fetch happened first?

Each extract-transform task relates to a single parent fetch task. With a
distributed file system it can run anywhere, as long as fetch happens first;
but given that locality is supported, this would be my preferred
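
To make the fetch -> extract-transform relationship concrete, here is a minimal sketch, assuming `view` is an IPython.parallel `LoadBalancedView` and using its `after` (ordering) and `follow` (location) flags. The helper name `submit_fetch_then_extract` and the task functions `fetch` and `extract` are hypothetical placeholders, not part of the library. Note that `follow` is a hard constraint as far as I understand it: all k extract tasks would be pinned to the engine that ran the fetch.

```python
def submit_fetch_then_extract(view, fetch, extract, filename, k, same_engine=True):
    """Submit one fetch task plus k extract-transform tasks that depend on it.

    Assumptions: `view` behaves like an IPython.parallel LoadBalancedView;
    `fetch(filename)` and `extract(filename, chunk)` are hypothetical tasks.
    """
    # The fetch task has no dependencies.
    fetch_ar = view.apply_async(fetch, filename)

    # `after`: do not start until the fetch task has finished.
    flags = {"after": [fetch_ar]}
    if same_engine:
        # `follow`: run on the engine where the fetch task ran (locality).
        flags["follow"] = [fetch_ar]

    # The flags apply only to tasks submitted inside the `with` block.
    with view.temp_flags(**flags):
        extract_ars = [view.apply_async(extract, filename, chunk)
                       for chunk in range(k)]
    return fetch_ar, extract_ars
```

With a distributed file system, passing `same_engine=False` drops the location constraint and lets the scheduler place the extract tasks on any engine once the fetch is done.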

> 3. Barrier is easy - asyncresult.wait and/or client.wait([list, of,
> asyncresults])
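
For reference, a minimal sketch of the barrier described in point 3, assuming `client` is an IPython.parallel `Client` and `view` is a view created from it. The helper name `run_stage_with_barrier` and its arguments are made up for illustration:

```python
def run_stage_with_barrier(client, view, func, inputs):
    """Submit one stage of tasks and block until all of them have finished.

    Assumptions: `client` behaves like an IPython.parallel Client and
    `view` like one of its views; `func` is any task function.
    """
    # Submit every task of the stage without blocking.
    async_results = [view.apply_async(func, item) for item in inputs]

    # Barrier: returns only once all submitted tasks are done.
    client.wait(async_results)

    # Safe to collect results now; get() re-raises remote errors, if any.
    return [ar.get() for ar in async_results]
```

Waiting on a single result with `asyncresult.wait()` works the same way when only one task needs to finish before the next stage starts.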


> 4. what is m? Is it n * k (one load task per extract task)? Do the load
> tasks want to happen in the same place as the extract, or can they run
> anywhere, again as long as it is later? What information does load need
> from extract-transform?

No, m is not n * k but rather the number of aggregates I want to compute;
each load task takes as input (n / m) * k intermediate results (generated
by extract-transform) and aggregates them.
The inputs are three-dimensional numpy arrays (roughly 500 MB in size).
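
To spell out the shape of the computation, here is a small pure-Python sketch of the dependency graph: n fetch tasks, k extract-transform tasks per fetch, and m load tasks that each consume (n / m) * k intermediate results. It does not use IPython.parallel at all; the function name `build_dag`, the tuple task labels, and the choice to group consecutive files into one aggregate are assumptions for illustration only.

```python
def build_dag(n, k, m):
    """Return {task: [tasks it depends on]} for the fetch/extract/load pipeline."""
    if n % m != 0:
        raise ValueError("this sketch assumes m divides n evenly")

    dag = {}
    for i in range(n):
        # Fetch tasks have no dependencies.
        dag[("fetch", i)] = []
        # Each of the k chunks of file i depends on exactly one fetch task.
        for chunk in range(k):
            dag[("extract", i, chunk)] = [("fetch", i)]

    files_per_load = n // m
    for j in range(m):
        # Assumption: aggregate j covers a block of consecutive files.
        files = range(j * files_per_load, (j + 1) * files_per_load)
        dag[("load", j)] = [("extract", i, chunk)
                            for i in files for chunk in range(k)]
    return dag


dag = build_dag(n=4, k=3, m=2)
print(len(dag[("load", 0)]))  # (n / m) * k = 6 inputs per load task
```

Each entry's dependency list is what would be passed as the `after` dependency when submitting that task to a load-balanced view.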

I've read the task dependence section in the user guide more carefully and
it seems I can express this computation easily in a DAG; even without the

Thanks for the help and the great project!


>> thanks,
>>  Peter
>> --
>> Peter Prettenhofer
>> _______________________________________________
>> IPython-User mailing list
>> IPython-User@scipy.org
>> http://mail.scipy.org/mailman/listinfo/ipython-user

Peter Prettenhofer