[IPython-user] Ipython and multiprocessing

Brian Granger ellisonbg.net@gmail....
Wed Dec 17 19:21:00 CST 2008


> The huge deal with multiprocessing is that the spawning of new processes
> is done by forks under Unix. This is great for efficiency, but also it
> gives a nice "feeling" to, e.g., the parallel loops, as globals are
> transparently distributed through the fork. The shared memory
> abstraction is very nice too. As a result multiprocessing feels really
> nice for parallel computation on a multi-CPU box.
> These two mechanisms would be a fantastic addition to IPython1, but I
> suspect that would be a lot of work, and I really don't have time to
> spend on this.

The great strength of multiprocessing (its use of fork to spawn
everything) is also its big weakness.  The fork-based model of
multiprocessing is orthogonal to the type of interactive computing that
IPython (and Python itself) offers.  It also means that the second you
want to scale beyond a multicore CPU, you have to go back to the drawing
board.  I think you *could* possibly hook up multiprocessing processes
across different hosts, but at that point, all the benefits go away
(true fast shared memory, globals, forking).
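
To make the point about globals concrete, here is a minimal sketch,
assuming a Unix host where the "fork" start method is available.  The
names SETTINGS, worker and run_forked are made up for illustration;
none of this is IPython code.

```python
import multiprocessing as mp

# Module-level state that lives only in the parent's memory.
SETTINGS = {"scale": 1}


def worker(x, conn):
    # SETTINGS is never passed in; the child sees the parent's value
    # because fork copied the parent's address space.
    conn.send(x * SETTINGS["scale"])
    conn.close()


def run_forked(values, scale):
    # Change the global at runtime, *before* any child is forked.
    SETTINGS["scale"] = scale
    ctx = mp.get_context("fork")  # assumption: Unix only
    jobs = []
    for x in values:
        parent_conn, child_conn = ctx.Pipe()
        proc = ctx.Process(target=worker, args=(x, child_conn))
        proc.start()
        child_conn.close()  # the parent keeps only its own end
        jobs.append((proc, parent_conn))
    results = []
    for proc, parent_conn in jobs:
        results.append(parent_conn.recv())
        proc.join()
    return results


if __name__ == "__main__":
    print(run_forked([1, 2, 3], 10))
```

The children pick up scale=10 even though it was set after import and
never sent to them.  That is the convenience fork buys on one box, and
it is exactly what has no equivalent once the workers live on another
host.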

In terms of using IPython with multiprocessing, the core problem is
that there is no reasonable way of forking an interactive session.  To
really support interactive parallel computing, experience (our own as
well as commercial products like Star-P) has shown that you really
need an architecture that is more like what IPython offers.  It is
_very_ possible that we could provide multiprocessing-like APIs in
IPython and we are more than willing to work with folks to develop
those.

I would also love to be able to use multiprocessing within IPython to
spawn IPython engines on multicore machines, but unfortunately,
multiprocessing currently has trouble playing well with Twisted, which
we use extensively.

Cheers,

Brian

> By the way, I have my own implementation of shared numpy arrays, that I
> am willing to share if anybody is interested. It is a thin wrapper on the
> array object of multiprocessing. It works for me, and it would be useful
> to make it a common object people use with numpy and multiprocessing, but
> I really don't have time to polish it and consider all the use cases
> right now. Does anybody want to pick it up?
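
For readers wondering what such a thin wrapper might look like, below is
a rough sketch.  It is not Gaël's implementation (which is not shown in
this thread).  It assumes numpy is installed and that the "fork" start
method is available; the helper names shared_array, fill_row and
fill_rows_in_children are hypothetical.

```python
import multiprocessing as mp

import numpy as np


def shared_array(ctx, shape):
    """Return (view, shared): a float64 numpy view over a multiprocessing Array."""
    size = int(np.prod(shape))
    shared = ctx.Array("d", size)  # zero-initialised shared buffer of doubles
    # get_obj() exposes the underlying ctypes array; frombuffer wraps it
    # without copying, so writes through the view land in shared memory.
    view = np.frombuffer(shared.get_obj(), dtype=np.float64).reshape(shape)
    return view, shared


def fill_row(view, row, value):
    # Runs in a child process and writes straight into the shared buffer.
    view[row, :] = value


def fill_rows_in_children():
    ctx = mp.get_context("fork")  # assumption: Unix only
    view, shared = shared_array(ctx, (2, 3))
    procs = [
        ctx.Process(target=fill_row, args=(view, r, float(r + 1)))
        for r in range(2)
    ]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    # The parent now sees what the children wrote.
    return view.copy()


if __name__ == "__main__":
    print(fill_rows_in_children())
```

Each child writes to a different row, so no locking is needed here; if
several processes wrote to the same elements, the lock returned by
shared.get_lock() would be the natural thing to hold around the update.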



> Gaël
>

