[SciPy-User] Speeding up Python Again
Ralf Gommers
ralf.gommers@googlemail....
Thu Aug 25 02:05:03 CDT 2011
On Tue, Aug 23, 2011 at 11:11 AM, Rajeev Singh <rajs2010@gmail.com> wrote:
>
>
> On Wed, Aug 10, 2011 at 6:48 PM, Rajeev Singh <rajs2010@gmail.com> wrote:
> > Hi,
> > I was trying out the codes discussed at
> > http://technicaldiscovery.blogspot.com/2011/07/speeding-up-python-again.html
> > Here is a summary of my results -
> > Computer:              Desktop    imsc9      aravali    annapurna
> > NumPy:                 7.651419   4.219105   5.576453   4.858640
> > Cython:                4.259419   3.477259   3.204909   2.357819
> > Weave:                 4.302778   *          3.298551   2.400000
> > Looped Fortran:        4.199148   3.414484   3.202963   2.315644
> > Vectorized Fortran:    3.118410   2.131966   1.512303   1.460251
> > pure fortran update1:  1.205727   1.964857   2.034688   1.336086
> > pure fortran update2:  0.600848   0.604649   0.573593   0.721339
> > imsc9, aravali and annapurna are HPC machines at my institute
> > * for some reason Weave didn't compile on imsc9
> >
> > Indeed there is about a factor of 7 to 12 difference between pure fortran
> > with update2 (vectorized) and the numpy version.
> > I should mention that I changed N to 150 in laplace_for.f90
> > Rajeev
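
The "factor of 7 to 12" quoted above can be checked directly from the table.
This is only a small arithmetic sketch: the timings are copied from the NumPy
and "pure fortran update2" rows, and the variable names are made up for
illustration.

```python
# Timings copied from the table above (NumPy row and "pure fortran update2" row).
numpy_times = {
    "Desktop": 7.651419,
    "imsc9": 4.219105,
    "aravali": 5.576453,
    "annapurna": 4.858640,
}
fortran_update2_times = {
    "Desktop": 0.600848,
    "imsc9": 0.604649,
    "aravali": 0.573593,
    "annapurna": 0.721339,
}

# Ratio of NumPy time to pure Fortran (update2) time on each machine.
ratios = {
    machine: numpy_times[machine] / fortran_update2_times[machine]
    for machine in numpy_times
}

for machine, ratio in ratios.items():
    print(f"{machine:10s} {ratio:5.1f}x")
```

The ratios come out between roughly 6.7 and 12.7, which agrees with the range
stated in the message.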
>
> Hi,
>
> Continuing the comparison of various ways of solving the Laplace
> equation, the following results might interest you -
>
>                                 Desktop   imsc9    aravali   annapurna
> Octave (0):                     20.7866   *        21.6179   *
> Vectorized Fortran (pure) (1):   0.7487   0.6501    0.7507   1.1619
> Vectorized Fortran (f2py) (2):   0.7190   0.6089    0.6243   1.0312
> NumPy (3):                       4.1343   2.5844    2.6565   3.7445
> Cython (4):                      1.7273   1.9927    2.0471   1.3525
> Cython with C (5):               1.7248   1.9665    2.0354   1.3367
> Weave (6):                       1.9818   *         2.1326   1.4003
> Looped Fortran (f2py) (7):       1.6996   1.9657    2.0429   1.3354
> Looped Fortran (pure) (8):       1.7189   2.0145    2.0917   1.5086
> C (pure) (9):                    1.2820   1.9948    2.0527   1.4259
>
> imsc9, aravali and annapurna are HPC machines at my institute
> * for some reason Weave didn't compile on imsc9
> * octave isn't installed on imsc9 and annapurna
>
> The difference between numpy and fortran performance seems significant.
> However, f2py now does as well as pure fortran. The difference from the
> earlier case is that there used to be a division inside the loop, which I
> have replaced by multiplication by the reciprocal. This does not affect the
> result but makes the execution faster in all cases except pure fortran (I
> guess the fortran compiler was already doing it).
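
For anyone who has not seen the change described above, here is a minimal
sketch of what replacing the division by a multiplication by the reciprocal
could look like in a looped Laplace update. It is plain Python for
readability and is not the poster's code: the function names, the grid
set-up and the boundary values are assumptions made for illustration, with
the update formula following the five-point stencil used in the blog post.

```python
def sweep_divide(u, dx2, dy2):
    """One in-place sweep with a division inside the inner loop."""
    for i in range(1, len(u) - 1):
        for j in range(1, len(u[0]) - 1):
            u[i][j] = ((u[i + 1][j] + u[i - 1][j]) * dy2 +
                       (u[i][j + 1] + u[i][j - 1]) * dx2) / (2.0 * (dx2 + dy2))


def sweep_reciprocal(u, dx2, dy2):
    """Same sweep, with the reciprocal computed once outside the loops."""
    inv = 1.0 / (2.0 * (dx2 + dy2))  # hoisted: one division per sweep
    for i in range(1, len(u) - 1):
        for j in range(1, len(u[0]) - 1):
            u[i][j] = ((u[i + 1][j] + u[i - 1][j]) * dy2 +
                       (u[i][j + 1] + u[i][j - 1]) * dx2) * inv


def make_grid(n):
    """Hypothetical test grid: zeros everywhere, top boundary row set to 1."""
    u = [[0.0] * n for _ in range(n)]
    u[0] = [1.0] * n
    return u
```

In floating point the two variants can differ in the last bits, so comparing
them with a small tolerance is safer than testing for exact equality. The
benefit shows up mainly in compiled loops (Cython, Weave, C, Fortran), since
the loop overhead in pure Python dominates the cost of the division.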
>
> I would be happy to give all the codes if someone is interested. Should we
> update the performance python page at scipy with these codes?
>
It would be nice to add this to http://www.scipy.org/PerformancePython. That
page currently has only one problem; seeing a few different problems compared
using the same methods would give a better impression of the speed differences.
It's a wiki page, so you should be able to add your code, problem
description and results yourself.
Cheers,
Ralf
