[IPython-User] Parallel question: Sending data directly between engines

Olivier Grisel olivier.grisel@ensta....
Sun Jan 8 14:30:18 CST 2012


2012/1/8 Brian Granger <ellisonbg@gmail.com>:
> Don't forget, you can always just use MPI/mpi4py with IPython, which
> has very efficient reduce/allreduce that use the spanning tree
> approach.

Indeed, but the MPI abstraction might be too strong (it hides too much
of the underlying computational runtime): in particular it might
prevent the algorithm implementer from leveraging data locality by
scheduling some tasks to run where the input data is known to be
already located (on the hard drive, or in shared memory as a
memory-mapped file for instance) rather than shipping it over the
network over and over again.

I like the ability of the IPython engines, controller and client to
collect whatever metadata they want, pass it around using pyzmq, and
make it possible to plug in your own smart scheduler based on that
runtime metadata.
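
To make the data-locality idea concrete, here is a minimal sketch in
plain Python. It does not use the IPython scheduler API: the function
name `schedule`, the `locations` mapping and the engine ids are all
hypothetical, and the metadata is assumed to have been collected
already by some other means. Each task goes to the least-loaded engine
that already holds its input, and only falls back to any engine (which
would mean shipping the data) when no engine has it.

```python
from collections import defaultdict


def schedule(tasks, locations, engines):
    """Assign each task to an engine, preferring engines that hold its input.

    tasks:     list of (task_id, input_name) pairs
    locations: dict mapping input_name -> set of engine ids holding that data
               (hypothetical runtime metadata)
    engines:   list of all engine ids
    """
    load = defaultdict(int)  # number of tasks assigned to each engine so far
    assignment = {}
    for task_id, input_name in tasks:
        # Engines that already hold the input data locally.
        local = [e for e in engines if e in locations.get(input_name, ())]
        # No local copy anywhere: any engine will do, at the cost of a transfer.
        candidates = local or engines
        # Least-loaded candidate; ties go to the first one in `engines` order.
        chosen = min(candidates, key=lambda e: load[e])
        assignment[task_id] = chosen
        load[chosen] += 1
    return assignment


tasks = [("t1", "chunk_a"), ("t2", "chunk_b"), ("t3", "chunk_a"), ("t4", "chunk_c")]
locations = {"chunk_a": {0}, "chunk_b": {1, 2}}
assignment = schedule(tasks, locations, engines=[0, 1, 2])
print(assignment)
```

Here both tasks reading chunk_a land on engine 0 even though it ends
up busier, which is the trade-off a locality-aware scheduler makes; a
real one would presumably also weigh data size against load.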

I have not checked mpi4py recently though. The last time I experimented
with MPI was more than 7 years ago, so things might have changed.

-- 
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel

