[Numpy-discussion] Surprising performance tweak in Cython

Eric Firing efiring@hawaii....
Sun Jun 22 22:02:38 CDT 2008


Gael Varoquaux wrote:
> I tried to tweak my Cython code for performance by manually inlining a
> small function, and ended up with slower code. I must confess I
> don't really understand what is going on here. If somebody has an
> explanation, I'd be delighted. The code follows.
> 
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> from numpy import zeros
> 
> # Make sure numpy is initialized.
> include "c_numpy.pxd"
> 
> ##############################################################################
> cdef int inner_loop(float c_x, float c_y):
>     cdef float x, y, x_buffer
>     x = 0; y = 0
>     cdef int i
>     for i in range(50):
>         x_buffer = x*x - y*y + c_x
>         y = 2*x*y + c_y
>         x = x_buffer
>         if (x*x + x*y > 100):

The line above looks like a typo, and does not match your inline 
version.  Could that be making the difference?

Eric


>             return 50 - i
> 
> def do_Mandelbrot_cython():
>     cdef ndarray threshold_time 
>     threshold_time = zeros((500, 500))
>     cdef double *tp
>     cdef float c_x, c_y
>     cdef int i, j
>     c_x = -1.5
>     tp = <double*>threshold_time.data
>     for i in range(500):
>         c_y = -1
>         for j in range(500):
>             tp += 1
>             c_y += 0.004
>             tp[0] = inner_loop(c_x, c_y)
>         c_x += 0.004
>     return threshold_time
> 
> 
> def do_Mandelbrot_cython2():
>     cdef ndarray threshold_time
>     threshold_time = zeros((500, 500))
>     cdef double *tp
>     tp = <double*>threshold_time.data
>     cdef float x, y, xbuffer, c_x, c_y
>     cdef int i, j, n 
>     c_x = -1.5
>     for i in range(500):
>         c_y = -1
>         for j in range(500):
>             tp += 1
>             c_y += 0.004
>             x = 0; y = 0
>             for n in range(50):
>                 x_buffer = x*x - y*y + c_x
>                 y = 2*x*y + c_y
>                 x = x_buffer
>                 if (x*x + y*y > 100):
>                     tp[0] = 50 - n
>                     break
>         c_x += 0.004
>     return threshold_time
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> 
> And the timings I get are:
> 
> In [2]: %timeit C.do_Mandelbrot_cython2()
> 10 loops, best of 3: 342 ms per loop
> 
> In [3]: %timeit C.do_Mandelbrot_cython()
> 10 loops, best of 3: 126 ms per loop
> 
> Cheers,
> 
> Gaël
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
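The question raised in the reply (whether the `x*x + x*y` test in inner_loop behaves differently from the `x*x + y*y` test in the inlined version) can be checked without Cython. What follows is only a rough pure-Python sketch, not the code from the original post: the name `escape_count` and its `typo` flag are made up for illustration, Python floats are doubles rather than C floats, and returning 0 when the loop never escapes is an assumption (the quoted inner_loop has no return statement for that case).

```python
def escape_count(c_x, c_y, typo=False, max_iter=50, limit=100.0):
    """Return max_iter - i for the first iteration i that passes the escape test.

    typo=False uses x*x + y*y (the test in the inlined version);
    typo=True uses x*x + x*y (the test as written in inner_loop).
    Returning 0 when no iteration escapes is an assumption of this sketch.
    """
    x = 0.0
    y = 0.0
    for i in range(max_iter):
        x_buffer = x * x - y * y + c_x
        y = 2 * x * y + c_y
        x = x_buffer
        if typo:
            test = x * x + x * y
        else:
            test = x * x + y * y
        if test > limit:
            return max_iter - i
    return 0


if __name__ == "__main__":
    # A point chosen by hand where the two tests disagree.
    print(escape_count(0.0, 2.0))             # 48
    print(escape_count(0.0, 2.0, typo=True))  # 47
```

At c = (0, 2) the two tests trigger one iteration apart, so the two versions do not compute the same thing and may not do the same amount of work per point. This sketch says nothing about how much of the timing gap that explains.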


