[IPython-User] Parallel question: Sending data directly between engines
Sat Jan 7 23:54:55 CST 2012
On Sat, Jan 7, 2012 at 21:37, Fernando Perez <firstname.lastname@example.org> wrote:
> Hi Olivier,
> On Fri, Jan 6, 2012 at 9:58 PM, Olivier Grisel <email@example.com> wrote:
> > Very interesting thread, thanks for sharing. Let me share some of my
> > related use cases to feed the IPython.parallel community with
> > some additional use cases you might not be aware of.
> Thanks a lot for this explanation and the link. I think I must have
> missed something, because it seems to me that we already offer that:
> In : from IPython.parallel import Client
> ...: v = Client()[:]
> In : x = rand(len(v.targets))
> ...: v.scatter('x', x)
> Out: <AsyncResult: scatter>
> In : v['x']
> [array([ 0.55751144]),
> array([ 0.51093798]),
> array([ 0.03574132]),
> array([ 0.31362035])]
> In : v['x_sum'] = sum(v['x'])
> ...: v['x_sum']
> No? That last line performs what the all_reduce operation is defined
> as: it pulls 'x' from each node, does the reduction, and sends back
> the global reduction to all nodes now named x_sum.
> Can you clarify what I'm not understanding?
The problem is that gathering everything to the client prior to calling
reduce won't work for the cases where the whole parallel problem doesn't
fit in memory on one node, so you have to do a *parallel* reduce, which can
be implemented without requiring more than two parts of the problem on a
node at any given time.

Thanks for this data point. I have been thinking about what a
'zeromq-style' parallel reduce might look like. Obviously, you can
implement one with all:all addressed messages (ROUTERs everywhere), but it
would be nice to let 0MQ do more of the legwork. And the difference
between reduce and allreduce is trivial, if I understand correctly - just
publish the final reduced result.
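
As a rough illustration of the pairwise idea, here is a sketch in plain
Python. It only simulates the engines as slots in a list; it does not use
IPython.parallel or 0MQ, and the names tree_reduce and tree_allreduce are
made up for this example. The point is just that each combine step involves
two parts, so no single node needs the whole problem in memory.

```python
from operator import add


def tree_reduce(parts, op=add):
    """Pairwise (tree) reduction over a list of per-engine values.

    Each list slot stands in for one engine (an assumption made for this
    sketch). In every round, engine i receives the partial result held by
    engine i + stride and combines the two, so an engine never holds more
    than two parts at once.
    """
    parts = list(parts)  # copy, so the caller's list is left untouched
    if not parts:
        raise ValueError("nothing to reduce")
    n = len(parts)
    stride = 1
    while stride < n:
        for i in range(0, n - stride, 2 * stride):
            # "engine i + stride sends its partial result to engine i"
            parts[i] = op(parts[i], parts[i + stride])
            parts[i + stride] = None  # the sender can free its part
        stride *= 2
    return parts[0]


def tree_allreduce(parts, op=add):
    """Reduce, then 'publish' the final result back to every engine."""
    total = tree_reduce(parts, op)
    return [total] * len(parts)


if __name__ == "__main__":
    print(tree_reduce([1, 2, 3, 4, 5]))
    print(tree_allreduce([1, 2, 3]))
```

On a real cluster each combine would be a message from one engine to
another rather than a list assignment, and how best to address those
messages is the open question above. The sketch takes about log2(n) rounds
for n engines and keeps the operands in their original order, so it also
works for operations that are associative but not commutative.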