[SciPy-User] help speeding up a Runge-Kuta algorithm (cython, f2py, ...)

Sturla Molden sturla@molden...
Sat Aug 4 18:28:48 CDT 2012

Not tested and debugged, but to me it looks like something like this 
might be what you want.


Den 03.08.2012 19:02, skrev Ryan Krauss:
> I need help speeding up some code I wrote to perform a Runge-Kuta
> integration.  I need to do the integration as part of a real-time
> control algorithm, so it needs to be fairly fast.
> scipy.integrate.odeint does too much error checking to be fast enough.
>   My pure Python version was just a little too slow, so I tried coding
> it up in Cython.  I have only used Cython once before, so I don't know
> if I did it correctly (the .pyx file is attached).
> The code runs just fine, but there is almost no speed up.  I think the
> core issue is that my dxdt_runge_kuta function gets called about 4000
> times per second, so most of my overhead is in the function calls (I
> think).  I am running my real-time control algorithm at 500 Hz and I
> need at least 2 Runge-Kuta integration steps per real-time steps for
> numeric stability.  And the Runge-Kuta algorithm needs to evaluate the
> derivative 4 times per times step.  So, 500 Hz * 2 * 4 = 4000 calls
> per second.
> I also tried coding this up in fortran and using f2py, but I am
> getting a type mismatch error I don't understand.  I have a function
> that declares its return values as double precision:
> double precision function dzdt(x,voltage)
> and I declare the variable I want to store the returned value in to
> also be double precision:
> double precision F,z,vel,accel,zdot1,zdot2,zdot3,zdot4
> zdot1 = dzdt(x_prev,volts)
> but some how it is not happy.
> My C skills are pretty weak (the longer I use Python, the more C I
> forget, and I didn't know that much to start with).  I started looking
> into Boost as well as using f2py on C code, but I got stuck.
> Can anyone either make my Cython or Fortran approaches work or point
> me in a different direction?
> Thanks,
> Ryan
> _______________________________________________
> SciPy-User mailing list
> SciPy-User@scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.scipy.org/pipermail/scipy-user/attachments/20120805/92056bbf/attachment.html 
-------------- next part --------------
#cython: boundscheck=False
#cython: wraparound=False

import numpy as np
from libc.math cimport exp, fabs

cdef inline void dxdt_runge_kuta(double *x "const double *", 
                                 double voltage "const double", 
                                 double *dxdt):

    cdef double J = 0.0011767297528720126 "const double" # these are all constants, using the "const trick"
    cdef double alpha0 = 4.1396263800000002 "const double"
    cdef double alpha1 = 8.7203167700000002 "const double"
    cdef double alpha2 = 0.010979707700000001 "const double"
    cdef double sigma0 = 2.391621670693391 "const double"
    cdef double sigma1 = 0.005368467489711875 "const double"
    cdef double v_gain = 1.3319218899042409 "const double"
    cdef double v_stribeck = 655.33634300000006 "const double"

    cdef double vel = x[1]
    cdef double z = x[2]
    cdef double g = alpha0 + alpha1*exp(-1*(vel/v_stribeck)**2)
    cdef double dzdt = vel - sigma0*z*fabs(vel)/g
    cdef double F = sigma0*z + sigma1*dzdt + alpha2*vel
    cdef double accel = 1.0/J*(voltage*v_gain - F)
    dxdt[0] = vel
    dxdt[1] = accel
    dxdt[2] = dzdt

def runge_kuta_one_step(double _x[::1], Py_ssize_t factor, double volts,
                        double dt_rk):

    cdef double g1[4], g2[4], g3[4], g4[4], dxdt[4], tmpx[4] # padding with one element for better alignment
    cdef double x[4]
    cdef double out[::1] = np.zeros(3)
    cdef Py_ssize_t q
    cdef int i    
    cdef double *x_prev, *x_out, *swap
    x_out = &out[0]
    x_prev = &x[0]
    x[0] = _x[0]
    x[1] = _x[1]
    x[2] = _x[2]

    for q in range(factor):

        dxdt_runge_kuta(x_prev, volts, dxdt)

        for i in range(3):
            g1[i] = dt_rk * dxdt[i]
            tmpx[i] = x_prev[i] + 0.5*g1[i]
        dxdt_runge_kuta(tmpx, volts, dxdt)

        for i in range(3):
            g2[i] = dt_rk * dxdt[i]
            tmpx[i] = x_prev[i] + 0.5*g2[i]
        dxdt_runge_kuta(tmpx, volts, dxdt)

        for i in range(3):
            g3[i] = dt_rk * dxdt[i]
            tmpx[i] = x_prev[i] + g3[i]

        dxdt_runge_kuta(tmpx, volts, dxdt)

        for i in range(3):
            g4[i] = dt_rk * dxdt[i]

        for i in range(3):
            x_out[i] = x_prev[i] + 1.0/6*(g1[i] + 2*g2[i] + 2*g3[i] + g4[i])
        swap = x_prev
        x_prev = x_out
        x_out = swap

    for i in range(3):
        out[i] = x_prev[i]

    return out

More information about the SciPy-User mailing list