[SciPy-User] Speeding up Python Again
Rajeev Singh
rajs2010@gmail....
Tue Aug 23 04:11:41 CDT 2011
On Wed, Aug 10, 2011 at 6:48 PM, Rajeev Singh wrote:
> Hi,
> I was trying out the codes discussed at
> http://technicaldiscovery.blogspot.com/2011/07/speeding-up-python-again.html
> Here is a summary of my results -
> Computer:               Desktop    imsc9      aravali    annapurna
> NumPy:                  7.651419   4.219105   5.576453   4.858640
> Cython:                 4.259419   3.477259   3.204909   2.357819
> Weave:                  4.302778   *          3.298551   2.400000
> Looped Fortran:         4.199148   3.414484   3.202963   2.315644
> Vectorized Fortran:     3.118410   2.131966   1.512303   1.460251
> pure fortran update1:   1.205727   1.964857   2.034688   1.336086
> pure fortran update2:   0.600848   0.604649   0.573593   0.721339
> imsc9, aravali and annapurna are HPC machines at my institute
> * for some reason Weave didn't compile on imsc9
>
> Indeed there is about a factor of 7 to 12 difference between pure fortran
> with update2 (vectorized) and the numpy version.
> I should mention that I changed N to 150 in laplace_for.f90
> Rajeev
Hi,
Continuing the comparison of various ways to solve the Laplace equation,
the following results might interest you:
                                 Desktop   imsc9    aravali   annapurna
Octave (0):                      20.7866   *        21.6179   *
Vectorized Fortran (pure) (1):    0.7487   0.6501    0.7507   1.1619
Vectorized Fortran (f2py) (2):    0.7190   0.6089    0.6243   1.0312
NumPy (3):                        4.1343   2.5844    2.6565   3.7445
Cython (4):                       1.7273   1.9927    2.0471   1.3525
Cython with C (5):                1.7248   1.9665    2.0354   1.3367
Weave (6):                        1.9818   *         2.1326   1.4003
Looped Fortran (f2py) (7):        1.6996   1.9657    2.0429   1.3354
Looped Fortran (pure) (8):        1.7189   2.0145    2.0917   1.5086
C (pure) (9):                     1.2820   1.9948    2.0527   1.4259
imsc9, aravali and annapurna are HPC machines at my institute
* for some reason Weave didn't compile on imsc9
* Octave isn't installed on imsc9 and annapurna
The difference between NumPy and Fortran performance seems significant:
in the table above NumPy is roughly 3 to 6 times slower than vectorized
Fortran. However, f2py now does as well as pure Fortran. The difference
from the earlier case is that there was a division inside the loop, which
I have replaced by multiplication by the reciprocal. This does not affect
the result but makes execution faster in all cases except pure Fortran (I
guess the Fortran compiler was already doing it).
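
To illustrate the change, here is a minimal NumPy sketch of the two
variants. It is not the benchmarked code: the function names, the grid
setup and the boundary condition are assumptions made for illustration,
and the update expression is the usual five-point Laplace stencil. The
two versions agree up to floating-point rounding.

```python
import numpy as np

def update_divide(u, dx2, dy2):
    # Earlier form: the whole stencil sum is divided by 2*(dx2 + dy2).
    u[1:-1, 1:-1] = ((u[2:, 1:-1] + u[:-2, 1:-1]) * dy2 +
                     (u[1:-1, 2:] + u[1:-1, :-2]) * dx2) / (2.0 * (dx2 + dy2))

def update_reciprocal(u, dx2, dy2):
    # Reciprocal form: compute 1 / (2*(dx2 + dy2)) once as a scalar,
    # then multiply the stencil sum by it.
    dnr_inv = 0.5 / (dx2 + dy2)
    u[1:-1, 1:-1] = ((u[2:, 1:-1] + u[:-2, 1:-1]) * dy2 +
                     (u[1:-1, 2:] + u[1:-1, :-2]) * dx2) * dnr_inv

def make_grid(n=150):
    # Hypothetical test problem: zero interior, one edge held at 1.
    u = np.zeros((n, n))
    u[0, :] = 1.0
    return u

if __name__ == "__main__":
    dx2 = dy2 = (1.0 / 150) ** 2
    a, b = make_grid(), make_grid()
    for _ in range(100):
        update_divide(a, dx2, dy2)
        update_reciprocal(b, dx2, dy2)
    print("results agree:", np.allclose(a, b))
```

In compiled loops the same idea means hoisting the division out of the
innermost loop; how much it helps depends on the compiler and its flags.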
I would be happy to share all the code if anyone is interested. Should we
update the Performance Python page at scipy with this code?
Rajeev
