[IPython-user] Cannot start ipcluster
Sun Oct 18 19:15:38 CDT 2009
2009/10/18 Brian Granger <email@example.com>
> Looks like you have been making progress...some comments:
> * Something quite odd is going on. While it would be nice if you could get
> a 2.4-2.7x speedup on a dual-core system, I don't think that result is
> real. I am not sure why you are seeing this, but it is *extremely* rare
> to see a speedup greater than the number of cores. It is possible, but I
> don't think your problem has
> any of the characteristics that would make it so.
> * From your description of the problem, ipython should be giving you nearly
> a 2x speedup, but it is quite a bit short of that here.
> The combination of these things makes me think there is an aspect of all of
> this we are not understanding yet.
> I suspect that the method you are using to time your code is not
> accurate. I have seen this type of
> thing before. Can you time it using a more accurate approach? Something like:
> from timeit import default_timer as clock
> t1 = clock()
> # ... code to time ...
> t2 = clock()
> print t2 - t1
> It is possible that IPython is slower than multiprocessing in this case,
> but something else is going on here.
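The suggested timing pattern, fleshed out as a complete runnable sketch (the squaring loop is only a stand-in workload, not part of the original scripts):

```python
from timeit import default_timer as clock

t1 = clock()
# stand-in workload; replace with the real processing call
total = sum(i * i for i in range(100000))
t2 = clock()
print("elapsed: %.6f s" % (t2 - t1))
```

`default_timer` picks the most precise wall-clock timer for the platform, which is what you want when measuring parallel runs.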
Here are new benchmark results (in seconds) using your suggested timing approach:
0-) Duration using the linear processing: 1048.07685399
1-) Duration using TaskClient and 2 Engines: 701.550107956
2-) Duration using MultiEngineClient and 2 Engines: 663.629260063
3-) I can't get timings with this method when I use the multiprocessing module.
I will send my 4 scripts to your email for further investigation. So far,
the results don't seem much different from what they were originally.
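Working out the ratios from the numbers above shows both parallel runs land around 1.5x, well under the 2x ceiling for two engines:

```python
# durations (seconds) quoted from the benchmark results above
linear = 1048.07685399        # 0-) linear processing
task_client = 701.550107956   # 1-) TaskClient, 2 engines
multi_engine = 663.629260063  # 2-) MultiEngineClient, 2 engines

print("TaskClient speed-up:        %.2fx" % (linear / task_client))
print("MultiEngineClient speed-up: %.2fx" % (linear / multi_engine))
# → about 1.49x and 1.58x
```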
> On Sun, Oct 18, 2009 at 2:01 PM, Gökhan Sever <firstname.lastname@example.org>wrote:
>> On Sun, Oct 18, 2009 at 2:34 PM, Gökhan Sever <email@example.com>wrote:
>>> Moreeeeee speed-up :)
>>> Next step is to use multiprocessing module.
>> I did two tests since I was not sure which timing to believe:
>> real 6m37.591s
>> user 10m16.450s
>> sys 0m4.808s
>> real 7m22.209s
>> user 11m21.296s
>> sys 0m5.540s
>> from which I figured out that real is what I want to look at. So the
>> improvement over the original linear 18m 5s run is a 2.4 to 2.7x
>> speed-up on a dual-core 2.5 GHz laptop using Python's multiprocessing
>> module, which is great for only a few added lines of code and a slight
>> modification of my original process_all wrapper script.
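For reference on reading those `time` figures: `real` is wall-clock time, while `user` is CPU time summed over all cores, so a parallel run can show `user` larger than `real`. From the first run above:

```python
# figures from the first `time` run above, converted to seconds
real = 6 * 60 + 37.591    # wall-clock time
user = 10 * 60 + 16.450   # CPU time summed across both cores
linear = 18 * 60 + 5      # original linear run: 18m 5s

print("average cores kept busy:  %.2f" % (user / real))   # ~1.55
print("speed-up vs linear run:   %.2fx" % (linear / real)) # ~2.73
```

The second run (`real 7m22.209s`) works out to about 2.45x, which brackets the quoted 2.4-2.7x range.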
>> Here is the code:
>> #!/usr/bin/env python
>> """Execute postprocessing_saudi script in parallel using multiprocessing."""
>> from multiprocessing import Pool
>> from subprocess import call
>> import os
>>
>> def find_sea_files():
>>     file_list, path_list = [], []
>>     init = os.getcwd()
>>     for root, dirs, files in os.walk('.'):
>>         for file in files:
>>             if file.endswith('.sea'):
>>                 file_list.append(file)
>>                 os.chdir(root)
>>                 path_list.append(os.getcwd())
>>                 os.chdir(init)
>>     return file_list, path_list
>>
>> def process_all(pf):
>>     os.chdir(pf[0])
>>     call(['postprocessing_saudi', pf[1]])
>>
>> if __name__ == '__main__':
>>     files, paths = find_sea_files()
>>     pathfile = [[paths[i], files[i]] for i in range(len(files))]
>>     pool = Pool(processes=2)  # start 2 worker processes
>>     pool.map(process_all, pathfile)
>> The main difference is the changed map call: since Pool.map supports only
>> one iterable argument, the path and file name are bundled into a single
>> pair. This approach also shows execution results on the terminal screen,
>> unlike IPython's. I am assuming that, like IPython's, the multiprocessing
>> module should be able to run on external nodes, which means that once I
>> can set up a few fast external machines I can perform a few more tests.
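The bundling trick used in the script can be sketched in isolation like this (the `process_pair` worker and the sample paths are made up for illustration):

```python
from multiprocessing import Pool

def process_pair(pair):
    # Pool.map delivers exactly one argument per task, so the path and
    # file name travel together as a two-element sequence.
    path, name = pair
    return '%s/%s' % (path, name)

if __name__ == '__main__':
    pairs = [['/data/a', 'x.sea'], ['/data/b', 'y.sea']]
    pool = Pool(processes=2)
    print(pool.map(process_pair, pairs))
    pool.close()
    pool.join()
```

Each worker unpacks its pair itself; nothing else about the map call has to change.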