[IPython-User] Parallel question: Sending data directly between engines
Sun Jan 8 00:11:03 CST 2012
On Sat, Jan 7, 2012 at 9:54 PM, MinRK <email@example.com> wrote:
> The problem is that gathering everything to the client prior to calling
> reduce won't work for the cases where the whole parallel problem doesn't fit
> in memory on one node, so you have to do a *parallel* reduce, which can be
> implemented without requiring more than two parts of the problem on a node
> at any given time.
Ah, but I'd understood that although the whole problem would never fit
on one node at once, this was about reducing the parameter
vector, which does fit on every node (in fact, Olivier says that it
gets broadcast at init time). It's just that over time the copies diverge
as each node computes parameter updates for the data it has, so the
whole vector needs to be recomputed as a weighted average of the
vectors on all the individual nodes. Now, if the number of nodes is so
large that holding n_nod copies of the parameter vector exceeds the
client's memory, then the weighted average can be done in batches, by
requesting data from only one group of nodes at a time.
At least that's how I read the original description...
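
A minimal sketch of that batched averaging, in plain NumPy so it runs without a cluster. The names here are hypothetical: fetch_batch stands in for whatever call pulls (vector, weight) pairs from a group of engines, and the weights are assumed to be something like the number of samples each node has seen. The client only ever holds one batch of vectors plus a running sum.

```python
import numpy as np

def batched_weighted_average(fetch_batch, node_ids, batch_size):
    """Weighted average of per-node parameter vectors, pulled in batches.

    fetch_batch(ids) is a hypothetical callable that returns a list of
    (vector, weight) pairs, one per requested node id.
    """
    weighted_sum = None   # running sum of weight * vector
    total_weight = 0.0    # running sum of weights
    for start in range(0, len(node_ids), batch_size):
        batch = node_ids[start:start + batch_size]
        # Only this batch of vectors is held on the client at once.
        for vec, weight in fetch_batch(batch):
            vec = np.asarray(vec, dtype=float)
            if weighted_sum is None:
                weighted_sum = np.zeros_like(vec)
            weighted_sum += weight * vec
            total_weight += weight
    if weighted_sum is None or total_weight == 0:
        raise ValueError("no nodes, or all weights are zero")
    return weighted_sum / total_weight

# Stand-in for a cluster: five "nodes", each with a diverged copy of a
# 3-element parameter vector and a weight (assumed: local sample count).
rng = np.random.default_rng(0)
fake_nodes = {i: (rng.normal(size=3), float(10 + i)) for i in range(5)}

def fake_fetch(ids):
    # In a real setup this would be a remote pull from the given engines.
    return [fake_nodes[i] for i in ids]

averaged = batched_weighted_average(fake_fetch, list(fake_nodes), batch_size=2)
```

The averaged vector would then be broadcast back to the nodes, as at init time. Note that this still routes every vector through the client, just not all at once, so it addresses client memory rather than client bandwidth.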