[IPython-User] Parallel question: Sending data directly between engines

Brian Granger ellisonbg@gmail....
Sun Jan 8 14:39:37 CST 2012


On Sun, Jan 8, 2012 at 12:30 PM, Olivier Grisel
<olivier.grisel@ensta.org> wrote:
> 2012/1/8 Brian Granger <ellisonbg@gmail.com>:
>> Don't forget, you can always just use MPI/mpi4py with IPython, which
>> has very efficient reduce/allreduce that use the spanning tree
>> approach.
>
> Indeed, but the MPI abstraction might be too strong (by hiding too much
> of the underlying computational runtime): in particular it might
> prevent the algorithm implementer from leveraging data locality by
> scheduling some tasks to run where the input data is known to be
> already located (on the hard drive or in shared memory as a
> memory-mapped file, for instance) rather than shipping it over the
> network over and over again.
>
> I like the ability of the IPython engines, controller and client to
> collect whatever metadata they want and pass it around using pyzmq,
> which makes it possible to plug in your own smart scheduler based on
> that runtime metadata.

Yes, IPython/pyzmq is definitely more flexible than MPI.
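
To make the data-locality point concrete, here is a minimal sketch of the
kind of decision a custom scheduler could make from runtime metadata. It is
plain Python and does not use any IPython or pyzmq API; the function name
`pick_engine`, the block names, and the two metadata dictionaries are all
hypothetical and only illustrate the idea of preferring the engine that
already holds a task's input.

```python
def pick_engine(task_inputs, engine_data, engine_load):
    """Pick the engine that already holds the most of a task's inputs.

    task_inputs: set of data block names the task needs (hypothetical).
    engine_data: dict engine id -> set of blocks held locally (hypothetical).
    engine_load: dict engine id -> number of tasks currently queued.
    Ties are broken by the lowest load, then by the lowest engine id.
    """
    def score(engine_id):
        local = len(task_inputs & engine_data.get(engine_id, set()))
        # Negate the local count so that min() prefers more local data.
        return (-local, engine_load[engine_id], engine_id)

    return min(engine_load, key=score)


# Hypothetical metadata as a scheduler might have collected it.
engine_data = {0: {"block-a", "block-b"}, 1: {"block-c"}, 2: set()}
engine_load = {0: 3, 1: 0, 2: 0}

# Engine 0 is busier, but it already holds block-a, so it is chosen.
chosen = pick_engine({"block-a"}, engine_data, engine_load)
```

A real scheduler would also have to weigh the cost of waiting on a busy
engine against the cost of moving the data, which this sketch ignores.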

> I have not checked mpi4py recently though. Last time I experimented
> with MPI was more than 7 years ago, so things might have changed.

It is truly an amazing package.  Some of the best code I know of.
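
For readers unfamiliar with the spanning tree approach to reduce mentioned
earlier in the thread, the following is a rough single-process simulation
in plain Python. It is not how any particular MPI implementation works
internally (implementations choose their own algorithms); it only shows why
a tree needs about log2(n) communication rounds instead of n - 1. The
function name `tree_reduce` is made up for this illustration. With mpi4py
itself the corresponding call would be `comm.allreduce(value, op=MPI.SUM)`
on a communicator such as `MPI.COMM_WORLD`.

```python
import operator


def tree_reduce(values, op=operator.add):
    """Combine one value per simulated rank pairwise, tree fashion.

    Returns (result, rounds), where rounds is the number of
    communication steps a tree-shaped reduction would need.
    """
    values = list(values)
    if not values:
        raise ValueError("need at least one value")
    rounds = 0
    stride = 1
    while stride < len(values):
        # In this round, rank i combines its value with rank i + stride.
        for i in range(0, len(values) - stride, 2 * stride):
            values[i] = op(values[i], values[i + stride])
        stride *= 2
        rounds += 1
    # Rank 0 now holds the reduced value.
    return values[0], rounds


# Eight simulated ranks holding 0..7: three rounds instead of seven.
total, rounds = tree_reduce(range(8))
```

An allreduce additionally has to get the result back to every rank, for
example by broadcasting it down the same tree.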

Cheers,

Brian

> --
> Olivier
> http://twitter.com/ogrisel - http://github.com/ogrisel



-- 
Brian E. Granger
Cal Poly State University, San Luis Obispo
bgranger@calpoly.edu and ellisonbg@gmail.com

