[IPython-user] Cannot start ipcluster

Brian Granger ellisonbg.net@gmail....
Sun Oct 18 17:14:05 CDT 2009


Looks like you have been making progress... some comments:

* Something quite odd is going on.  While it would be nice if you could
get a 2.4-2.7x speedup on a dual-core system, I don't think that result
is real.  I am not sure why you are seeing this, but it is *extremely*
rare to see a speedup greater than the number of cores.  It is possible,
but I don't think your problem has any of the characteristics that
would make it so.
* From your description of the problem, IPython should be giving you
nearly a 2x speedup, but what you are seeing is considerably lower.

The combination of these things makes me think there is an aspect of
all of this that we are not understanding yet.  I suspect that the
method you are using to time your code is not accurate.  I have seen
this type of thing before.  Can you time it using a more accurate
approach?  Something like:

from timeit import default_timer as clock

t1 = clock()
# ... the code being timed ...
t2 = clock()
print "elapsed: %.1f seconds" % (t2 - t1)
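
For example, wrapped around the pool.map call in your script below (a
sketch only, reusing the process_all and pathfile names from your
script):

from timeit import default_timer as clock

t1 = clock()
pool.map(process_all, pathfile)     # the call being measured
t2 = clock()
print "pool.map took %.1f seconds" % (t2 - t1)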

It is possible that IPython is slower than multiprocessing in this
case, but I suspect something else is going on here as well.

Cheers,

Brian

On Sun, Oct 18, 2009 at 2:01 PM, Gökhan Sever <gokhansever@gmail.com> wrote:

>
>
> On Sun, Oct 18, 2009 at 2:34 PM, Gökhan Sever <gokhansever@gmail.com> wrote:
>
>>
>> Moreeeeee speed-up :)
>>
>> Next step is to use multiprocessing module.
>>
>
> I did two tests since I was not sure which timing to believe:
>
> real    6m37.591s
> user    10m16.450s
> sys    0m4.808s
>
> real    7m22.209s
> user    11m21.296s
> sys    0m5.540s
>
> from which I figured out that "real" (wall-clock time) is what I want
> to see; "user" sums CPU time across both cores, which is why it
> exceeds "real".  So relative to the original serial run of 18m 5s
> (1085 s), this is a 2.4-2.7x speedup (1085/442 ≈ 2.45 and 1085/398 ≈
> 2.73) on a dual-core 2.5 GHz laptop using Python's multiprocessing
> module, which is great for only adding a few lines of code and
> slightly modifying my original process_all wrapper script.
>
> Here is the code:
>
>
> #!/usr/bin/env python
>
> """
> Execute postprocessing_saudi script in parallel using multiprocessing
> module.
> """
>
> from multiprocessing import Pool
> from subprocess import call
> import os
>
>
> def find_sea_files():
>     """Walk the current directory tree and collect every .sea file
>     along with the absolute path of the directory containing it."""
>     file_list, path_list = [], []
>     init = os.getcwd()
>
>     for root, dirs, files in os.walk('.'):
>         dirs.sort()    # sort in place so the walk order is deterministic
>         for fname in files:
>             if fname.endswith('.sea'):
>                 file_list.append(fname)
>                 # chdir into the directory and back to record its
>                 # absolute path
>                 os.chdir(root)
>                 path_list.append(os.getcwd())
>                 os.chdir(init)
>
>     return file_list, path_list
>
>
> def process_all(pf):
>     # pf is a [path, filename] pair; run the postprocessing there
>     os.chdir(pf[0])
>     call(['postprocessing_saudi', pf[1]])
>
>
> if __name__ == '__main__':
>     pool = Pool(processes=2)              # start 2 worker processes
>     files, paths = find_sea_files()
>     # pack each (path, file) pair into one argument, since pool.map
>     # passes a single item from the iterable to each call
>     pathfile = [[paths[i], files[i]] for i in range(len(files))]
>     pool.map(process_all, pathfile)
>
>
> The main difference is the changed map call: Pool.map, unlike Python's
> built-in map, accepts only one iterable argument, so the path and the
> filename have to be packed into a single pair.  This approach also
> shows execution results on the terminal screen, unlike IPython's.  I
> am assuming that, like IPython's, the multiprocessing module should be
> able to run on external nodes, which means that once I can set up a
> few fast external machines I can perform a few more tests.
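>
> As an aside, the same pairing could be written with zip, which is
> equivalent to the list comprehension above (a small sketch, assuming
> paths and files line up one-to-one):
>
> pathfile = zip(paths, files)    # [(path1, file1), (path2, file2), ...]
> pool.map(process_all, pathfile)
>
> process_all then receives tuples instead of lists; pf[0] and pf[1]
> work the same either way.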
>
> --
> Gökhan
>