[Numpy-discussion] Surprising performance tweak in Cython
Eric Firing
efiring@hawaii....
Sun Jun 22 22:02:38 CDT 2008
Gael Varoquaux wrote:
> I tried to tweak my Cython code for performance by manually inlining a
> small function, and ended up with slower code. I must confess I don't
> really understand what is going on here. If somebody has an
> explanation, I'd be delighted. The code follows.
>
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> from numpy import zeros
>
> # Make sure numpy is initialized.
> include "c_numpy.pxd"
>
> ##############################################################################
> cdef int inner_loop(float c_x, float c_y):
>     cdef float x, y, x_buffer
>     x = 0; y = 0
>     cdef int i
>     for i in range(50):
>         x_buffer = x*x - y*y + c_x
>         y = 2*x*y + c_y
>         x = x_buffer
>         if (x*x + x*y > 100):
The line above looks like a typo (x*y where your inlined version has
y*y), so the two versions do not compute the same thing. Could that be
making the difference?
Eric
>             return 50 - i
>
> def do_Mandelbrot_cython():
>     cdef ndarray threshold_time
>     threshold_time = zeros((500, 500))
>     cdef double *tp
>     cdef float c_x, c_y
>     cdef int i, j
>     c_x = -1.5
>     tp = <double*>threshold_time.data
>     for i in range(500):
>         c_y = -1
>         for j in range(500):
>             tp += 1
>             c_y += 0.004
>             tp[0] = inner_loop(c_x, c_y)
>         c_x += 0.004
>     return threshold_time
>
>
> def do_Mandelbrot_cython2():
>     cdef ndarray threshold_time
>     threshold_time = zeros((500, 500))
>     cdef double *tp
>     tp = <double*>threshold_time.data
>     cdef float x, y, xbuffer, c_x, c_y
>     cdef int i, j, n
>     c_x = -1.5
>     for i in range(500):
>         c_y = -1
>         for j in range(500):
>             tp += 1
>             c_y += 0.004
>             x = 0; y = 0
>             for n in range(50):
>                 x_buffer = x*x - y*y + c_x
>                 y = 2*x*y + c_y
>                 x = x_buffer
>                 if (x*x + y*y > 100):
>                     tp[0] = 50 - n
>                     break
>         c_x += 0.004
>     return threshold_time
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
> And the timings I get are:
>
> In [2]: %timeit C.do_Mandelbrot_cython2()
> 10 loops, best of 3: 342 ms per loop
>
> In [3]: %timeit C.do_Mandelbrot_cython()
> 10 loops, best of 3: 126 ms per loop
>
> Cheers,
>
> Gaël
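To see that the two escape tests really do give different results, here is a minimal pure-Python sketch (not the Cython code from the post). The function name escape_count, the typo flag, and the return value of 0 for points that never escape are assumptions made for illustration. The test point c = (0, 2) lies outside the 500x500 grid used in the post and was picked only because the arithmetic is easy to follow by hand.

```python
def escape_count(c_x, c_y, typo=False, max_iter=50):
    """Iterate z -> z*z + c and return max_iter - i at the first escape.

    typo=True uses the escape test from inner_loop (x*x + x*y);
    typo=False uses the test from the inlined version (x*x + y*y).
    Returning 0 when the point never escapes is an assumption.
    """
    x = y = 0.0
    for i in range(max_iter):
        x_buffer = x*x - y*y + c_x
        y = 2*x*y + c_y
        x = x_buffer
        # Conditional expression binds loosest, so each branch is a full sum.
        test = x*x + x*y if typo else x*x + y*y
        if test > 100:
            return max_iter - i
    return 0


if __name__ == "__main__":
    # Same point, two different escape tests, two different answers.
    print(escape_count(0.0, 2.0))             # test from the inlined version
    print(escape_count(0.0, 2.0, typo=True))  # test from inner_loop
```

At this point the inner_loop test fires one iteration later than the inlined test (47 versus 48), so the two functions may be doing different amounts of work per grid point, which would make the timings hard to compare directly.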