[IPython-User] IPython.parallel over Infiniband?

MinRK benjaminrk@gmail....
Fri Sep 7 15:12:26 CDT 2012


This is probably a question for zeromq-dev, about how one might use
infiniband with zeromq.

There are benchmarks for zeromq tcp-over-infiniband, so presumably it is
possible, though it may require some flags to be set when building libzmq
itself (I have no idea).

Once you know how to use zmq+tcp over ib, then there shouldn't be anything
IPython needs to be aware of.

It's also possible that it's as simple as specifying a particular IP
(again, I have no idea).  If the infiniband interconnect shows up as a
particular network interface on the node, then it should simply be a
matter of passing that interface's address: `ipcontroller --ip=<1.2.3.4>`.
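
To make that concrete, here is a rough sketch of looking up an interface's
address and building the command line. It assumes a Linux node where
IP-over-InfiniBand appears as an interface called `ib0` (a common name, but
an assumption on my part; check `ip addr` on your cluster). The helpers
`interface_ip` and `controller_command` are made up for illustration and
are not part of IPython.

```python
import fcntl
import socket
import struct

SIOCGIFADDR = 0x8915  # Linux-only ioctl: fetch the IPv4 address of an interface


def interface_ip(ifname):
    """Return the IPv4 address bound to a Linux network interface.

    Hypothetical helper: raises OSError if the interface does not exist
    or has no IPv4 address configured.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        # struct ifreq starts with the interface name (at most 15 bytes + NUL).
        request = struct.pack("256s", ifname.encode()[:15])
        reply = fcntl.ioctl(s.fileno(), SIOCGIFADDR, request)
    # The IPv4 address sits at bytes 20..23 of the returned ifreq.
    return socket.inet_ntoa(reply[20:24])


def controller_command(ip):
    """Build the ipcontroller invocation that listens on the given address."""
    return ["ipcontroller", "--ip=%s" % ip]
```

Something like `controller_command(interface_ip("ib0"))` would then give
the argument list to launch, assuming `ib0` really is the IPoIB interface.
Whether traffic on that address actually travels over the infiniband fabric
is a question about the cluster's network setup, not IPython.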

On Fri, Sep 7, 2012 at 4:21 AM, Jon Olav Vik <jonovik@gmail.com> wrote:

> Short version: Can the IPython.parallel ipcontroller and ipengines use
> Infiniband for communication?
>
> Background:
>
> As mentioned in previous posts, I use IPython.parallel on a shared batch
> cluster, where I submit ipengines as relatively short "batch jobs" for use
> by a Client.load_balanced_view(retries=..., chunksize=..., ordered=False).
> This gives me load-balanced, fault-tolerant (in particular if an engine job
> times out) computing of otherwise trivially parallel tasks. This is by far
> the most maintainable framework I've found, and it scales well to at least
> 100 processors, or > 600 if I use several clusters. The limiting factor
> seems to be the number and latency of TCP connections.
>
> I recently got kicked out from a batch cluster for failing to utilize their
> precious Infiniband, and for *possibly* competing with the batch system's
> use of TCP. (This was not further investigated, as that cluster was
> intended to fill other needs than my rather-trivially-parallel computing,
> and so they didn't really want me around anyway.)
>
> Now, I know next to nothing about what Infiniband is, but googling
> suggested that TCP can be run over Infiniband.
>
> http://pkg-ofed.alioth.debian.org/howto/infiniband-howto-5.html
>
> I wonder if that could improve the latency of IPython.parallel tasks, while
> letting me be less of a nuisance to the batch cluster admins. Any hints on
> whether and how this can be achieved would be most appreciated.
>
> (I have mostly heard about Infiniband in connection with MPI. However, MPI
> doesn't seem to fit my needs because 1) all MPI processes need to start and
> stop at the same time, whereas I wish to use as many processors as happen
> to be available, without specifying the number in advance, 2) the ipcluster
> cannot use MPI for coordination, and 3) I wish to distribute tasks and
> results using a load_balanced_view() and not explicitly over MPI.)
>
> Best regards,
> Jon Olav