[SciPy-User] help speeding up a Runge-Kuta algorithm (cython, f2py, ...)

Sturla Molden sturla@molden...
Tue Aug 7 13:10:55 CDT 2012

On 07.08.2012 18:37, Ryan Krauss wrote:

> For many Runge-Kutta steps, your Cython code is 200 times faster than
> my pure Python version.  Fortran is still 1.6 times faster than the
> Cython version, but the Fortran version is much more work to code up.

Don't expect anything to be "faster than Fortran" for certain kinds of 
numerical work. Cython has a certain overhead (larger than C and 
Fortran), and since it compiles to ANSI C (C89, not ISO C99) we cannot 
use restrict-qualified pointers. But still, ~75% of Fortran performance 
is often acceptable! 
Another thing is you need to look at "scalability". How much of that 
extra runtime is constant due to differences between Cython and f2py? 
How much is variable due to the numerical kernel being faster in 
Fortran? Will differently sized problems give you the same overhead from 
using Cython? It often helps to plot a graph of the performance (mean 
and error bars) for various problem sizes, rather than benchmarking at 
one single point.
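
A minimal sketch of such a scaling benchmark, in pure Python with only 
the standard library. The kernel rk4_steps and the helper benchmark are 
hypothetical stand-ins, not code from this thread; swap in the Cython 
or f2py wrapper you actually want to measure. If the per-step time 
falls as n grows, a good part of the runtime is constant call overhead 
rather than the numerical kernel.

```python
import statistics
import timeit


def rk4_steps(n):
    """Hypothetical stand-in kernel: n RK4 steps of dy/dt = -y on [0, 1]."""
    y = 1.0
    h = 1.0 / n
    for _ in range(n):
        k1 = -y
        k2 = -(y + 0.5 * h * k1)
        k3 = -(y + 0.5 * h * k2)
        k4 = -(y + h * k3)
        y += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return y


def benchmark(func, sizes, repeats=5):
    """Time func(n) for each n; return {n: (mean_seconds, stdev_seconds)}."""
    results = {}
    for n in sizes:
        # One call per timing, repeated `repeats` times (needs repeats >= 2
        # so that a sample standard deviation is defined).
        times = timeit.repeat(lambda: func(n), number=1, repeat=repeats)
        results[n] = (statistics.mean(times), statistics.stdev(times))
    return results


if __name__ == "__main__":
    for n, (mean, sd) in benchmark(rk4_steps, [100, 1000, 10000]).items():
        print(f"n={n:6d}  mean={mean:.6f}s  sd={sd:.6f}s  "
              f"per-step={mean / n:.3e}s")
```

The (mean, stdev) pairs are what you would feed to an error-bar plot, 
one curve per implementation.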

Correctness is always more important than speed. That is one thing to 
consider too. With Cython we can begin with a tested Python prototype 
and optimize along the way, using the Python profiler to pinpoint where 
it matters the most. Python, NumPy and Cython will not win the world 
championship of being "fastest on the CPU" for simple numerical kernels, 
but that is not the idea either. Implementing complex algorithms in 
Fortran can be a PITA compared to Python. But Cython helps us in a 
straightforward way to speed up Python code and/or interface with C or 
C++. Fortran is only nice for helping us scientists to avoid the pointer 
arithmetic of C, but Cython's memoryviews do that too.
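
As a rough illustration of the "profile the tested prototype first" 
workflow, here is a sketch using cProfile and pstats from the standard 
library. The functions f, rk4_step, integrate and profile_report are 
made up for the example and are not the code discussed in this thread; 
the point is only that the report shows which Python functions dominate 
the runtime and are therefore worth moving to Cython first.

```python
import cProfile
import io
import pstats


def f(t, y):
    """Right-hand side of the test ODE dy/dt = -y (hypothetical example)."""
    return -y


def rk4_step(t, y, h):
    """One classical fourth-order Runge-Kutta step."""
    k1 = f(t, y)
    k2 = f(t + 0.5 * h, y + 0.5 * h * k1)
    k3 = f(t + 0.5 * h, y + 0.5 * h * k2)
    k4 = f(t + h, y + h * k3)
    return y + h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0


def integrate(n_steps, h=0.001):
    """Pure Python prototype: advance y(0) = 1 by n_steps steps of size h."""
    t, y = 0.0, 1.0
    for _ in range(n_steps):
        y = rk4_step(t, y, h)
        t += h
    return y


def profile_report(n_steps=20000, top=5):
    """Profile integrate() and return the top entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    integrate(n_steps)
    profiler.disable()
    buf = io.StringIO()
    stats = pstats.Stats(profiler, stream=buf)
    stats.sort_stats("cumulative").print_stats(top)
    return buf.getvalue()


if __name__ == "__main__":
    print(profile_report())
```

Since the prototype has a known answer (y(1) = exp(-1) for this ODE), 
the same check can be rerun after each function is ported, so 
correctness is verified at every optimization step.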

