[IPython-User] Parallel direct view pull and bandwidth

RICHARD Georges-Emmanuel perspective.electronic@gmail....
Mon Aug 8 03:25:34 CDT 2011


Minrk,
     Thanks for the hint.
Indeed, switching to a numpy array unleashed the bandwidth. I can confirm this
with a numpy array of 62500 float64 values (500 kB),
with machine "A" hosting the controller and 2 ipengines:
pull FOO from machine "A" 2 engines 8.27 ms   (2*500kB/8.27e-3 s => ~121 MB/s)
pull FOO from machine "A" 1 engine  7.26 ms   (500kB/7.26e-3 s => ~69 MB/s)
That looks much better.

And with a numpy array of 500000 float64 values (4 MB):
pull FOO from machine"A" 2 engines 44 ms    (~181 MB/s)
pull FOO from machine"A" 1 engine  23 ms    (~173 MB/s)
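
As a sanity check on the figures above, the throughput arithmetic can be written out explicitly. This is only a sketch: the helper name throughput_mb_per_s is made up for illustration, it assumes 8 bytes per float64 value and 1 MB = 10^6 bytes, and it ignores any message overhead.

```python
def throughput_mb_per_s(n_values, n_engines, seconds, bytes_per_value=8):
    """Aggregate pull throughput in MB/s (hypothetical helper, 1 MB = 1e6 bytes)."""
    total_bytes = n_values * bytes_per_value * n_engines
    return total_bytes / float(seconds) / 1e6

# Timings reported above, local machine "A".
small_two = throughput_mb_per_s(62500, 2, 8.27e-3)   # about 121 MB/s
small_one = throughput_mb_per_s(62500, 1, 7.26e-3)   # about 69 MB/s
large_two = throughput_mb_per_s(500000, 2, 44e-3)    # about 182 MB/s
large_one = throughput_mb_per_s(500000, 1, 23e-3)    # about 174 MB/s

for label, value in [("62500 x 2 engines", small_two),
                     ("62500 x 1 engine", small_one),
                     ("500000 x 2 engines", large_two),
                     ("500000 x 1 engine", large_one)]:
    print("%-20s %.1f MB/s" % (label, value))
```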

Otherwise, I tried switching to 'pickle', but it doesn't change things. I
configured it on the command line,
on a local machine set up with 4 xterm consoles:
ipcontroller --ip='*' --Session.packer='pickle'
ipengine --Session.packer='pickle'
ipengine --Session.packer='pickle'
ipython --Session.packer='pickle'

then from ipython:

import time
from IPython.parallel import Client
rc = Client(packer='pickle')
dview=rc[:]
dview.execute("FOO=[0.0 for i in xrange(62500)]",block=True)  # 62500 * float64 -> 500 kB of data to transfer
[None,None]
T=time.time();tmp=dview.pull('FOO');print time.time() - T    # for 2 ipengines
3.4 s
T=time.time();tmp=dview.pull('FOO',0);print time.time() - T # for 1 ipengine
2.5 s
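
Since the list timings barely move between packers, it may be worth timing the serializers on their own, outside IPython. The following is a minimal sketch under stated assumptions: the stdlib json and pickle modules stand in for the Session packers, the helper time_packer is a made-up name, and this is not the code path IPython actually uses to pull a variable.

```python
import json
import pickle
import time

def time_packer(pack, unpack, payload, repeats=3):
    """Best wall-clock time for one pack + unpack round trip (hypothetical helper)."""
    best = float("inf")
    for _ in range(repeats):
        start = time.time()
        restored = unpack(pack(payload))
        best = min(best, time.time() - start)
    # Make sure the round trip did not alter the data.
    assert restored == payload
    return best

# Same payload as FOO above: a plain Python list of 62500 floats.
foo = [0.0 for _ in range(62500)]

json_time = time_packer(json.dumps, json.loads, foo)
pickle_time = time_packer(
    lambda obj: pickle.dumps(obj, pickle.HIGHEST_PROTOCOL), pickle.loads, foo)

print("json   round trip: %.4f s" % json_time)
print("pickle round trip: %.4f s" % pickle_time)
```

If both round trips come out far below the 2.5 s measured above, the time is presumably being spent somewhere other than the packer itself.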

I will continue to play around; there are lots of things to learn anyway.

Thanks again,
     Joe


On 08/08/2011 11:31, MinRK wrote:
> These are indeed interesting numbers, thanks for running them!  The 
> one thing to be careful of is that IPython uses JSON to serialize by 
> default.  If you are on Python 2.6, the stdlib json is *extremely* 
> slow.  IPython will prefer jsonlib/jsonlib2 or more recent simplejson 
> if you have them, all of which are significantly faster.  If you are 
> concerned with serialization performance, you can specify a different 
> serialization scheme, such as cPickle, which you can activate with:
>
> c.Session.packer='pickle' in your config files or on the command-line 
> (you will then also have to specify `rc = Client(packer='pickle')` when 
> you create your Client)
>
> When I am checking performance limits, I tend to use msgpack 
> <http://msgpack.org/>, which I enable with:
>
> c.Session.packer='msgpack.packb'
> c.Session.unpacker='msgpack.unpackb'
>
> in my config files.
>
> (again, now you have to do: `rc = Client(packer='msgpack.packb', 
> unpacker='msgpack.unpackb')`.  I will get the config properly hooked up 
> to the Client soon, so it will inherit correctly from the controller)
>
> Serializing a list of ints is more a test of the message serialization 
> scheme than IPython's throughput, because pretty much the whole time 
> will be spent making a giant JSON list (pickle should be much faster). 
>  If you really want to test the raw throughput of IPython+ØMQ, you 
> should try sending numpy arrays, which are supported with zero-copy 
> sends, that allow us to reach ~Gb limits on unimpressive laptops:
>
> rc = Client()
> dview = rc[:]
> with dview.sync_imports():
>     import numpy
> dview.execute("foo=numpy.random.random(62500)", block=True) # 8-byte floats
> %time dview.pull('foo', block=True);
>
> -MinRK
>
> On Sun, Aug 7, 2011 at 19:38, RICHARD Georges-Emmanuel 
> <perspective.electronic@gmail.com> wrote:
>
>     Hi Minrk,
>
>     First of all, congratulations to the whole IPython team for the great work
>     you did with release 0.11 and ZMQ 2.1.7. I'm a fan.
>
>     I tried the parallel with direct views, that's great.
>
>     with a machine A (192.168.1.4)
>     1)    ipcontroller --ip='*'
>     from machine "A" I remotely start an ipengine on machine "B" (192.168.1.200)
>     2)    ssh root@192.168.1.200 ipengine
>     --file=/sharedMachineAfs/root/.config/ipython/profile_default/security/ipcontroller-engine.json
>     &
>     (I do step 2 twice to get 2 ipengines; I also tried locally with
>     only machine "A")
>
>     Then I start ipython to create a client, and I want to evaluate the
>     bandwidth (and latency in a second step).
>
>     import time
>     from IPython.parallel import Client
>     rc = Client()
>     dview=rc[:]
>     dview.execute("FOO=[0.0 for i in xrange(62500)]",block=True)  # 62500 * float64 -> 500 kB of data to transfer
>     [None,None]
>     T=time.time();tmp=dview.pull('FOO');print time.time() - T    # for 2 ipengines
>     T=time.time();tmp=dview.pull('FOO',0);print time.time() - T # for 1 ipengine
>
>     in case of machine "A" as controller and machine "B" as 2 ipengines,
>     on a 100 Mb/s network (12.5 MB/s):
>     pull FOO from machine "B" 2 engines 9.03 seconds   (2*500kB/9.03 => ~111 kB/s)
>     pull FOO from machine "B" 1 engine  4.7 seconds    (500kB/4.7 => ~106 kB/s)
>
>     in case of machine "A" as controller and as 2 ipengines, on a local machine:
>     pull FOO from machine "A" 2 engines 3.4 seconds    (2*500kB/3.4 => ~294 kB/s)
>     pull FOO from machine "A" 1 engine  2.7 seconds    (500kB/2.7 => ~185 kB/s)
>
>     I guess I'm doing something wrong, or misusing something. Any hint
>     would be appreciated; anyway, I will continue to dig in.
>
>     Machines "A" and "B" are running a RHEL5-flavoured distro, with
>     Python 2.6 and IPython 0.11 installed from source.
>     Machine"A" is a Quad core 2.6GHz
>     Machine"B" is an AMD64 3000+  1.8GHz (pretty old but still alive)
>
>     cheers.
>                 Joe
>


-- 
RICHARD Georges-Emmanuel
CEO - Electronic and Computer Engineer
perspective.electronic@gmail.com
遠大電子有限公司 (統一編號24470425)
手機 +886930319433
電話 +88635735463
