[SciPy-User] Speeding things up - how to use more than one computer core
Troels Emtekær Linnet
tlinnet@gmail....
Sat Apr 6 17:17:59 CDT 2013
And the winner was joblib :-)
Method was normal
Done :0:00:00.291000
[49990.0, 49991.0, 49992.0, 49993.0, 49994.0, 49995.0, 49996.0, 49997.0,
49998.0, 49999.0] <type 'numpy.float64'>
Method was Pool
Done :0:00:01.558000
[49990.0, 49991.0, 49992.0, 49993.0, 49994.0, 49995.0, 49996.0, 49997.0,
49998.0, 49999.0] <type 'numpy.float64'>
Method was joblib
Done :0:00:00.003000
[49990, 49991, 49992, 49993, 49994, 49995, 49996, 49997, 49998, 49999]
<type 'int'>
Method was joblib delayed
Done :0:00:00
[49990, 49991, 49992, 49993, 49994, 49995, 49996, 49997, 49998, 49999]
<type 'int'>
--------------------------------------
import multiprocessing
from datetime import datetime
from joblib import Parallel, delayed
def getsqrt(n):
res = sqrt(n**2)
return(res)
def main():
jobs = multiprocessing.cpu_count()-1
a = range(50000)
for method in ['normal','Pool','joblib','joblib delayed']:
startTime = datetime.now()
sprint=True
if method=='normal':
res = []
for i in a:
b = getsqrt(i)
res.append(b)
elif method=='Pool':
pool = Pool(processes=jobs)
res = pool.map(getsqrt, a)
elif method=='joblib':
Parallel(n_jobs=jobs)
func,res = (getsqrt, a)
elif method=='joblib delayed':
Parallel(n_jobs=-2) #Can also use '-1' for all cores, '-2' for
all cores=-1
func,res = delayed(getsqrt), a
else:
sprint=False
if sprint:
print "Method was %s"%method
print "Done :%s"%(datetime.now()-startTime)
print res[-10:], type(res[-1])
return(res)
if __name__ == "__main__":
res = main()
Troels Emtekær Linnet
2013/4/6 Ralf Gommers <ralf.gommers@gmail.com>
>
>
>
> On Sat, Apr 6, 2013 at 5:40 PM, Troels Emtekær Linnet <tlinnet@gmail.com>wrote:
>
>> Dear Scipy users.
>>
>> I am doing analysis of some NMR data, where I repeatability are doing
>> leastsq fitting.
>> But I get a little impatient for the time-consumption. For a run of my
>> data, it takes
>> approx 3-5 min, but it in this testing phase, it is to slow.
>>
>> A look in my task manager, show that I only consume 25%=1 core on my
>> computer.
>> And I have access to a computer with 24 cores, so I would like to speed
>> things up.
>> ------------------------------------------------
>> I have been looking at the descriptions of multithreading/Multiprocess
>> http://www.scipy.org/Cookbook/Multithreading
>> http://stackoverflow.com/questions/4598339/parallelism-with-scipy-optimize
>> http://www.scipy.org/ParallelProgramming
>>
>>
>> But I hope someone can guide me, which of these two methods I should go
>> for, and how to implement it?
>> I am little unsure about GIL, synchronisation?, and such things, which I
>> know none about.
>>
>> For the real data, I can see that I am always waiting for the call of the
>> leastsq fitting.
>> How can start a pool of cores when I go through fitting?
>>
>
> Have a look at http://pythonhosted.org/joblib/parallel.html, that should
> allow you to use all cores without much effort. It uses multiprocessing
> under the hood. That's assuming you have multiple fits that can run in
> parallel, which I think is the case. I at least see some fits in a for-loop.
>
> Ralf
>
>
> _______________________________________________
> SciPy-User mailing list
> SciPy-User@scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.scipy.org/pipermail/scipy-user/attachments/20130407/b4fd7456/attachment.html
More information about the SciPy-User
mailing list