<br><br><div class="gmail_quote">On Sun, Oct 18, 2009 at 12:27 PM, Brian Granger <span dir="ltr"><<a href="http://ellisonbg.net">ellisonbg.net</a>@<a href="http://gmail.com">gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
Also, once you fix the import/name error, can you send me your script? That way I can see if there are any performance issues we can address.<br><br>Cheers,<br><font color="#888888"><br>Brian</font></blockquote><div><br>
<br>This is a slightly modified version of the original script that I used for timings. I just defined a generic function that finds the .sea files and extracts their paths, so that I can pass both to the main processing function to be mapped. I am also attaching it as a file for easy access. I will test the MultiEngineClient approach as well.<br>
<br>#!/usr/bin/env python<br><br>"""<br>Execute postprocessing_saudi script in parallel using IPython's parallel<br>computing features. Make sure "ipcluster local -n 2" has run in a separate<br>
shell window before running this script<br>"""<br><br># Gokhan Sever <<a href="mailto:gokhansever@gmail.com">gokhansever@gmail.com</a>><br># University of North Dakota - Dept. of Atmospheric Sciences<br>
# Written : October 18, 2009 with help from Brian Granger<br><br>from IPython.kernel.client import TaskClient<br>from subprocess import call<br>import os<br><br><br>def find_sea_files():<br><br> file_list, path_list = [], []<br>
init = os.getcwd()<br><br> for root, dirs, files in os.walk('.'):<br> dirs.sort()<br> for file in files:<br> if file.endswith('.sea'):<br> file_list.append(file)<br>
os.chdir(root)<br> path_list.append(os.getcwd())<br> os.chdir(init)<br><br> return file_list, path_list<br><br><br>def process_all(path, file):<br> import os<br> from subprocess import call<br>
os.chdir(path)<br> call(['postprocessing_saudi', file])<br><br><br>if __name__ == '__main__':<br> tc = TaskClient()<br> files, paths = find_sea_files()<br> tc.map(process_all, paths, files)<br>
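<br>As a side note, the directory paths could probably be collected without changing the working directory during the walk. The following is only a sketch under that assumption: the optional top argument and the sorting of file names are additions for illustration and are not part of the script above.<br>

```python
import os


def find_sea_files(top="."):
    """Return parallel lists of .sea file names and their absolute directories."""
    file_list, path_list = [], []
    for root, dirs, files in os.walk(top):
        dirs.sort()  # in-place sort keeps the traversal order deterministic
        for name in sorted(files):
            if name.endswith(".sea"):
                file_list.append(name)
                # abspath resolves root against the current directory,
                # so no os.chdir() round trip is needed
                path_list.append(os.path.abspath(root))
    return file_list, path_list
```

Called as find_sea_files(), it should return the same two lists that the script passes to tc.map.<br>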
<br> </div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><div><div></div><div class="h5"><br><br><div class="gmail_quote">On Sun, Oct 18, 2009 at 10:26 AM, Brian Granger <span dir="ltr"><<a href="http://ellisonbg.net" target="_blank">ellisonbg.net</a>@<a href="http://gmail.com" target="_blank">gmail.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><br><div class="gmail_quote"><div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
<div class="gmail_quote"><div>from IPython.kernel.client import TaskClient<br>
from subprocess import call<br>
<br>def process(file):<br> call(['postprocessing_saudi', file])<br><br></div></div></blockquote></div><div><br>This function (process) will be pickled and sent to the engines. But, the engines don't have call imported yet.<br>
There are two ways to fix this. The first is to import inside the function:<div><br><br>def process(file):<br> from subprocess import call<br> call(['postprocessing_saudi', file])<br><br></div>But this adds a small amount of overhead, because the import statement runs on every call.<br>
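<br>The difference between a module-level import and a function-level import can be checked without a cluster. The sketch below is only a rough stand-in for what an engine sees: it rebuilds each function with fresh globals using the standard library, and it does not use or reproduce IPython's own serialization. The helper run_in_empty_namespace and the placeholder command are hypothetical names introduced for illustration.<br>

```python
import builtins
import sys
import types
from subprocess import call  # imported on the "client" side only


def process_global(cmd):
    # Looks up "call" in the function's globals at run time.
    return call(cmd)


def process_local(cmd):
    # The import is part of the body, so it runs wherever the body runs.
    from subprocess import call
    return call(cmd)


def run_in_empty_namespace(func, *args):
    """Rebuild func with fresh globals, a rough stand-in for a remote engine."""
    fresh_globals = {"__builtins__": builtins}
    remote = types.FunctionType(func.__code__, fresh_globals, func.__name__)
    return remote(*args)


cmd = [sys.executable, "-c", "pass"]  # harmless placeholder command

print(run_in_empty_namespace(process_local, cmd))
try:
    run_in_empty_namespace(process_global, cmd)
except NameError as exc:
    print("NameError:", exc)
```

After the first call, Python keeps the imported module cached in sys.modules, so repeating the import inside the function costs little more than a lookup.<br>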
<br>To import it once, do:<br><br>from IPython.kernel.client import MultiEngineClient<br>mec = MultiEngineClient()<br>mec.execute('from subprocess import call')<br><br>before you submit the tasks. The MultiEngineClient interface doesn't do load balancing, just does things on<br>
the engines you specify (the default is all).<br><br>Give that a shot...<br><font color="#888888"><br>Brian<br> </font></div><div><div></div><div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
<div class="gmail_quote"><div>files = ['/home/gsever/Desktop/test/20090317_131342/PostProcessing/09_03_17_13_13_42.sea',<div><br> '/home/gsever/Desktop/test/20090318_075533/PostProcessing/09_03_18_07_55_33.sea',<br>
'/home/gsever/Desktop/test/20090319_110816/PostProcessing/09_03_19_11_08_16.sea',<br> '/home/gsever/Desktop/test/20090320_064651/PostProcessing/09_03_20_06_46_51.sea']<br><br></div>if __name__ == '__main__':<br>
 tc = TaskClient()<br> tc.map(process, files)<br><br><br>Should I return something from the process function? And why is it complaining about "NameError: global name 'call' is not defined"? I am already importing call at the top. <br>
<br> </div><div><div></div><div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><div class="gmail_quote"><div><font color="#888888"></font></div>
<div><div></div>
<div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><div class="gmail_quote"><div> </div>
<div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><div class="gmail_quote"><div> </div><div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
<div class="gmail_quote">
<div>Doing "ps -aux | grep ip" yields <br><br>gsever 15467 0.7 0.3 20700 14264 pts/2 S+ 00:02 0:00 /usr/bin/python /usr/bin/ipcluster local -n 4<br>gsever 15468 1.1 0.5 40424 23544 pts/2 S+ 00:02 0:00 /usr/bin/python /usr/bin/ipcontroller --logfile=/home/gsever/.ipython/log/ipcontroller<br>
gsever 15469 1.1 0.4 36980 20332 pts/2 S+ 00:02 0:00 /usr/bin/python /usr/bin/ipengine --logfile=/home/gsever/.ipython/log/ipengine15468-<br>gsever 15470 1.1 0.5 37396 20668 pts/2 S+ 00:02 0:00 /usr/bin/python /usr/bin/ipengine --logfile=/home/gsever/.ipython/log/ipengine15468-<br>
gsever 15471 0.9 0.4 37064 20392 pts/2 S+ 00:02 0:00 /usr/bin/python /usr/bin/ipengine --logfile=/home/gsever/.ipython/log/ipengine15468-<br>gsever 15472 1.2 0.4 36976 20328 pts/2 S+ 00:02 0:00 /usr/bin/python /usr/bin/ipengine --logfile=/home/gsever/.ipython/log/ipengine15468-<br>
<br>and the system monitor shows these tasks as sleeping. See the screenshot: <a href="http://img148.imageshack.us/img148/6240/screenshotsystemmonitor.png" target="_blank">http://img148.imageshack.us/img148/6240/screenshotsystemmonitor.png</a><br>
<br></div></div></blockquote></div><div><br>It could show up as sleeping and still be working. <br></div></div></blockquote></div><div><br>They might be working, but they are not letting me give them extra work :)<br> </div>
<div><div></div><div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
<div class="gmail_quote"><div><font color="#888888"><br>Brian<br> </font></div><div><div></div><div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
<div class="gmail_quote"><div>I hope this is not another Fedora 11 related bug. <br> </div><div><div></div><div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
<div class="gmail_quote"><div>
Cheers,<br><font color="#888888"><br>Brian<br> </font></div><div><div></div><div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
<div class="gmail_quote"><div><br>I[1]: from IPython.kernel.client import MultiEngineClient<div><br>
/usr/lib/python2.6/site-packages/Twisted-8.2.0-py2.6-linux-i686.egg/twisted/python/filepath.py:12: DeprecationWarning: the sha module is deprecated; use the hashlib module instead<br> import sha<br>/usr/lib/python2.6/site-packages/foolscap-0.4.2-py2.6.egg/foolscap/banana.py:2: DeprecationWarning: the sets module is deprecated<br>
<br></div>I[2]: mec = MultiEngineClient()<br><br>---------------------------------------------------------------------------<br>ConnectionRefusedError Traceback (most recent call last)<br><br>/home/gsever/Desktop/<ipython console> in <module>()<br>
<br>/home/gsever/Desktop/python-repo/ipython/IPython/kernel/client.pyc in get_multiengine_client(furl_or_file)<br> 67 """<br> 68 client = blockingCallFromThread(_client_tub.get_multiengine_client, <br>
---> 69 furl_or_file)<br> 70 return client.adapt_to_blocking_client()<br> 71 <br><br>/home/gsever/Desktop/python-repo/ipython/IPython/kernel/twistedutil.pyc in blockingCallFromThread(f, *a, **kw)<br>
70 @raise: any error raised during the callback chain.<br> 71 """<br>---> 72 return twisted.internet.threads.blockingCallFromThread(reactor, f, *a, **kw)<br> 73 <br>
74 else:<br>
<br>/usr/lib/python2.6/site-packages/Twisted-8.2.0-py2.6-linux-i686.egg/twisted/internet/threads.pyc in blockingCallFromThread(reactor, f, *a, **kw)<br> 112 result = queue.get()<br> 113 if isinstance(result, failure.Failure):<br>
--> 114 result.raiseException()<br> 115 return result<br> 116 <br><br>/usr/lib/python2.6/site-packages/Twisted-8.2.0-py2.6-linux-i686.egg/twisted/python/failure.pyc in raiseException(self)<br> 324 information if available.<br>
325 """<br>--> 326 raise self.type, self.value, self.tb<br> 327 <br> 328 <br><br>ConnectionRefusedError: Connection was refused by other side: 111: Connection refused.<br><br><br>
I had installed Twisted and other parallel computing requirements following the instructions that were listed on the documentation pages.<br><br> </div><div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
mec.get_ids()<br><br>It may all be working...all you see are deprecation warnings related to Python 2.6.<br><br>One more word. The parallel computing stuff is not working in trunk right now (I am fixing it),<br>
so please stick with 0.10.<br></blockquote></div><div><br>I was bitten by the trunk a few times. I have been using the downgraded (0.10) version for over a month :)<br> </div><div><div></div><div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
<br>Cheers,<br><br>Brian<br><br><div class="gmail_quote"><div><div></div><div>On Sat, Oct 17, 2009 at 5:41 PM, Gökhan Sever <span dir="ltr"><<a href="mailto:gokhansever@gmail.com" target="_blank">gokhansever@gmail.com</a>></span> wrote:<br>
</div></div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><div><div></div><div>Hello,<br><br>I want to experiment with IPython's parallel computing functionality. So far I have not made much progress, because starting ipcluster stalls with the following messages and never drops me into the main IPython shell. <br>
<br>My intention is to parallelize a small Python script that calls an external set of scripts to process the dataset I have in hand. The task is not especially demanding, but on my Intel 2.5 GHz dual-core Core 2 machine it takes about 1.5 hours to process the whole dataset. Looking at the system monitor, I see that the workload is not evenly distributed between the CPUs (one of them is usually much lazier than the other). I am sure that parallelizing the run would boost the processing speed. My dataset has 17 folders, and each folder is independent of the others. My script visits each folder and calls the main external script via the subprocess module's call function. Processing starts with the first folder and does not move on to the next folder until the previous one has finished. Basically, what I want is to run the externally called scripts in separate threads, so that I don't need to wait for the previous job to finish.<br>
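<br>For comparison, the "separate threads" idea described above can be sketched with the standard library alone on a modern Python. This is not the IPython approach discussed in this thread; run_in_folder, run_all, and the workers parameter are hypothetical names chosen for illustration.<br>

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_in_folder(folder, command):
    """Run command with folder as its working directory; return the exit code."""
    # cwd= sets the directory for the child process only, which avoids
    # os.chdir(), a process-wide change that is unsafe across threads
    return subprocess.call(command, cwd=folder)


def run_all(folders, command, workers=2):
    """Run command once per folder, at most `workers` at a time."""
    # Threads are enough here: each worker mostly waits on its child process.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda folder: run_in_folder(folder, command), folders))
```

The returned list holds one exit code per folder, in the same order as the input list.<br>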
<br>From the IPython parallel computing documentation, it seems that what I want is doable in IPython. However, I need some advice on whether my understanding is correct, and on how to resolve the warning messages below. <br>
<br>Thanks.<br><br><br>[gsever@ccn Desktop]$ ipcluster local -n 4<br>/usr/lib/python2.6/site-packages/Twisted-8.2.0-py2.6-linux-i686.egg/twisted/python/filepath.py:12: DeprecationWarning: the sha module is deprecated; use the hashlib module instead<br>
import sha<br>/usr/lib/python2.6/site-packages/foolscap-0.4.2-py2.6.egg/foolscap/banana.py:2: DeprecationWarning: the sets module is deprecated<br>2009-10-17 18:59:37-0500 [-] Log opened.<br>2009-10-17 18:59:37-0500 [-] Process ['ipcontroller', '--logfile=/home/gsever/.ipython/log/ipcontroller'] has started with pid=11066<br>
2009-10-17 18:59:37-0500 [-] Waiting for controller to finish starting...<br>2009-10-17 18:59:38-0500 [-] '/usr/lib/python2.6/site-packages/Twisted-8.2.0-py2.6-linux-i686.egg/twisted/python/filepath.py:12: DeprecationWarning: the sha module is deprecated; use the hashlib module instead\n import sha\n'<br>
2009-10-17 18:59:38-0500 [-] '/usr/lib/python2.6/site-packages/foolscap-0.4.2-py2.6.egg/foolscap/banana.py:2: DeprecationWarning: the sets module is deprecated\n'<br>2009-10-17 18:59:39-0500 [-] Controller started<br>
2009-10-17 18:59:39-0500 [-] Process ['ipengine', '--logfile=/home/gsever/.ipython/log/ipengine11066-'] has started with pid=11067<br>2009-10-17 18:59:39-0500 [-] Process ['ipengine', '--logfile=/home/gsever/.ipython/log/ipengine11066-'] has started with pid=11068<br>
2009-10-17 18:59:39-0500 [-] Process ['ipengine', '--logfile=/home/gsever/.ipython/log/ipengine11066-'] has started with pid=11069<br>2009-10-17 18:59:39-0500 [-] Process ['ipengine', '--logfile=/home/gsever/.ipython/log/ipengine11066-'] has started with pid=11070<br>
2009-10-17 18:59:39-0500 [-] Engines started with pids: [11067, 11068, 11069, 11070]<br>2009-10-17 18:59:39-0500 [-] '/usr/lib/python2.6/site-packages/Twisted-8.2.0-py2.6-linux-i686.egg/twisted/python/filepath.py:12: DeprecationWarning: the sha module is deprecated; use the hashlib module instead\n import sha\n'<br>
2009-10-17 18:59:39-0500 [-] '/usr/lib/python2.6/site-packages/Twisted-8.2.0-py2.6-linux-i686.egg/twisted/python/filepath.py:12: DeprecationWarning: the sha module is deprecated; use the hashlib module instead\n import sha\n'<br>
2009-10-17 18:59:39-0500 [-] '/usr/lib/python2.6/site-packages/foolscap-0.4.2-py2.6.egg/foolscap/banana.py:2: DeprecationWarning: the sets module is deprecated\n'<br>2009-10-17 18:59:40-0500 [-] '/usr/lib/python2.6/site-packages/foolscap-0.4.2-py2.6.egg/foolscap/banana.py:2: DeprecationWarning: the sets module is deprecated\n'<br>
2009-10-17 18:59:40-0500 [-] '/usr/lib/python2.6/site-packages/Twisted-8.2.0-py2.6-linux-i686.egg/twisted/python/filepath.py:12: DeprecationWarning: the sha module is deprecated; use the hashlib module instead\n import sha\n'<br>
2009-10-17 18:59:40-0500 [-] '/usr/lib/python2.6/site-packages/Twisted-8.2.0-py2.6-linux-i686.egg/twisted/python/filepath.py:12: DeprecationWarning: the sha module is deprecated; use the hashlib module instead\n import sha\n'<br>
2009-10-17 18:59:40-0500 [-] '/usr/lib/python2.6/site-packages/foolscap-0.4.2-py2.6.egg/foolscap/banana.py:2: DeprecationWarning: the sets module is deprecated\n'<br>2009-10-17 18:59:40-0500 [-] '/usr/lib/python2.6/site-packages/foolscap-0.4.2-py2.6.egg/foolscap/banana.py:2: DeprecationWarning: the sets module is deprecated\n'<br>
<br clear="all"><br>Here is my system info:<br>================================================================================<br>Platform : Linux-2.6.29.6-217.2.3.fc11.i686.PAE-i686-with-fedora-11-Leonidas<br>Python : ('CPython', 'tags/r26', '66714')<br>
IPython : 0.10<br>NumPy : 1.4.0.dev<br>================================================================================<br><br>-- <br><font color="#888888">Gökhan<br>
</font><br></div></div>_______________________________________________<br>
IPython-user mailing list<br>
<a href="mailto:IPython-user@scipy.org" target="_blank">IPython-user@scipy.org</a><br>
<a href="http://mail.scipy.org/mailman/listinfo/ipython-user" target="_blank">http://mail.scipy.org/mailman/listinfo/ipython-user</a><br>
<br></blockquote></div><br>
</blockquote></div></div></div><br><br clear="all"><br>-- <br><font color="#888888">Gökhan<br>
</font></blockquote></div></div></div><br>
</blockquote></div></div></div><br><br clear="all"><br>-- <br><font color="#888888">Gökhan<br>
</font></blockquote></div></div></div><br>
</blockquote></div></div></div><br><br clear="all"><br>-- <br><font color="#888888">Gökhan<br>
</font></blockquote></div></div></div><br>
</blockquote></div></div></div><br><br clear="all"><br>-- <br><font color="#888888">Gökhan<br>
</font></blockquote></div></div></div><br>
</blockquote></div><br>
</div></div></blockquote></div><br><br clear="all"><br>-- <br>Gökhan<br>