[Numpy-discussion] f2py performance question from a rookie

Vasileios Gkinis v.gkinis@nbi.ku...
Mon Aug 16 11:34:34 CDT 2010


Sturla Molden <sturla <at> molden.no> writes:

> 
> > It looks like the gain in performance is rather low compared to tests
> > I have seen elsewhere.
> >
> > Am I missing something here..?
> >
> > Cheers...Vasilis
> 
> Turn HTML off please.
> 
> Use time.clock(), not time.time().
> 
> Try some tasks that actually take a while. Tasks that take 10**-4 or
> 10**-3 seconds cannot be reliably timed on Windows or Linux.
> 


Hi again,

After switching to time.clock(), running f2py with REPORT_ON_ARRAY_COPY enabled,
and passing arrays as np.asfortranarray(array) to the Fortran routines, I still
get slow performance from f2py. No array copies are reported.
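
As a cross-check on the copy report, the memory layout can also be inspected
from the Python side. The following is only a minimal sketch with hypothetical
array names; it uses plain NumPy calls and does not assume any f2py module.

```python
import numpy as np

# Hypothetical 2-D input in NumPy's default C (row-major) order.
a = np.zeros((4, 3))
print(a.flags['C_CONTIGUOUS'], a.flags['F_CONTIGUOUS'])

# np.asfortranarray returns a column-major array; for a C-ordered 2-D
# input the data have to be copied into a new buffer.
a_f = np.asfortranarray(a)
print(a_f.flags['F_CONTIGUOUS'], np.shares_memory(a, a_f))

# A contiguous 1-D array (like beta, lamda, mu) is already both C- and
# F-contiguous, so no copy is needed.
beta = np.linspace(0.0, 1.0, 5)
beta_f = np.asfortranarray(beta)
print(beta_f.flags['F_CONTIGUOUS'], np.shares_memory(beta, beta_f))
```

Since the routine in this post only receives 1-D inputs and returns b as an
intent(out) array, the absence of reported copies is what one would expect.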

Running the f2py version with a 6499x6499 array takes about 1.2 sec, while the
Python for loop does the job slightly faster, at 0.9 sec.
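
One way to make such timings less sensitive to noise is to repeat the call and
keep the fastest run. This is a sketch using the standard timeit module rather
than the time.clock() calls used in this thread; fill_loop is a hypothetical
stand-in for whichever function is being measured.

```python
import timeit
import numpy as np

def fill_loop(n):
    """Hypothetical stand-in for the code being timed."""
    m = np.zeros((n, n))
    for row in range(1, n - 1):
        m[row, row] = 1.0
    return m

# Each of the 5 repeats runs the call 3 times; the fastest repeat is the
# one least disturbed by other activity on the machine.
number = 3
times = timeit.repeat(lambda: fill_loop(500), repeat=5, number=number)
best = min(times) / number  # seconds per call
print("best of 5: %0.5e s per call" % best)
```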

Comparisons like this:
http://www.scipy.org/PerformancePython
indicate a 100x-1000x boost in performance with f2py when compared to
conventional Python for loops.

Still quite puzzled... any help will be very much appreciated.

Regards
Vasileios 

The actual Fortran subroutine is here:

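subroutine fill_B(j, beta, lamda, mu, b)
implicit none
integer, intent(in) ::  j
integer :: row, col
real(kind = 8),  intent(in) :: beta(j), lamda(j), mu(j)
real(kind = 8), intent(out) :: b(j,j)

! Zero the first and last rows so that no element of the intent(out)
! array is left undefined; the corner entries are set after the loop.
b(1,:) = 0
b(j,:) = 0

! Interior rows: tridiagonal coefficients, zero everywhere else.
do row=2,j-1
    do col=1,j
        if (col == row-1) then
            b(row,col) = beta(row) - lamda(row) + mu(row)
        elseif (col == row) then
            b(row,col) = 1 - 2*beta(row)
        elseif (col == row+1) then
            b(row, col) = beta(row) + lamda(row) - mu(row)
        else
            b(row, col) = 0
        endif
    enddo
enddo
b(1,1) = 1
b(j,j) = 1

end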

and the Python code that calls it, together with the alternative implementation
with conventional Python for loops, is here:


def crank_heat(self, profile_i, go, g1, beta, lamda, mu, usef2py = False):
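        """
        Advance profile_i by one Crank-Nicolson step (A X = B D) and
        return the new profile, with boundary values go and g1.
        """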
    
        ## Crank-Nicolson: A X = B D
        ## Make matrix A (coefficients of the n+1 step)
        N = np.size(profile_i)
        print(N)
        t1 = time.clock()
        
        if usef2py:
            matrix_A = fill_A.fill_a(j = N, beta = beta, lamda = lamda, mu = mu)
            matrix_B = fill_B.fill_b(j = N, beta = beta, lamda = lamda, mu = mu)
        else:
            matrix_A = np.zeros((N,N))
            matrix_A[0,0] = 1
            matrix_A[-1,-1] = 1
            for row in np.arange(1,N-1):
                matrix_A[row,row-1] = -(beta[row] - lamda[row] + mu[row])
                matrix_A[row, row] = 1 + 2*beta[row]
                matrix_A[row, row+1] = -(beta[row] + lamda[row] - mu[row])
            
            
            #make matrix B
            matrix_B = np.zeros((N,N))
            matrix_B[0,0] = 1
            matrix_B[-1,-1] = 1
            for row in np.arange(1,N-1):
                matrix_B[row,row-1] = beta[row] - lamda[row] + mu[row]
                matrix_B[row, row] = 1 - 2*beta[row]
                matrix_B[row, row+1] = beta[row] + lamda[row] - mu[row]
        
        print("CN function time: %0.5e" %(time.clock()-t1))
        
        matrix_C = np.dot(matrix_B, profile_i)
        t1 = time.clock()
        matrix_X = np.linalg.solve(matrix_A, matrix_C)
        print("CN solve time: %0.5e" %(time.clock()-t1))
        matrix_X[0] = go
        matrix_X[-1] = g1
        
            
        
        return matrix_X
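
One thing that may be worth noting when comparing the two branches: the Python
branch lets np.zeros create the matrix and then writes only about 3*N entries,
whereas the Fortran routine visits all j*j entries (about 42 million for
j = 6499) with up to three comparisons each, and its inner loop runs over col,
the second index, which is the strided direction in Fortran's column-major
layout. This may account for part of the gap, though it has not been measured
here. The sketch below is a loop-free NumPy variant of the matrix B fill,
checked against the loop from this post. The function names and the random test
data are hypothetical.

```python
import numpy as np

def fill_b_loop(beta, lamda, mu):
    """Same loop as in the post, extracted into a function."""
    n = beta.size
    b = np.zeros((n, n))
    b[0, 0] = 1
    b[-1, -1] = 1
    for row in range(1, n - 1):
        b[row, row - 1] = beta[row] - lamda[row] + mu[row]
        b[row, row] = 1 - 2 * beta[row]
        b[row, row + 1] = beta[row] + lamda[row] - mu[row]
    return b

def fill_b_indexed(beta, lamda, mu):
    """Hypothetical loop-free variant using integer index arrays."""
    n = beta.size
    b = np.zeros((n, n))
    b[0, 0] = 1
    b[-1, -1] = 1
    r = np.arange(1, n - 1)  # interior rows only
    b[r, r - 1] = beta[r] - lamda[r] + mu[r]  # sub-diagonal
    b[r, r] = 1 - 2 * beta[r]                 # main diagonal
    b[r, r + 1] = beta[r] + lamda[r] - mu[r]  # super-diagonal
    return b

# Made-up coefficients, only to compare the two functions.
rng = np.random.default_rng(0)
beta, lamda, mu = rng.random((3, 50))
same = np.allclose(fill_b_loop(beta, lamda, mu),
                   fill_b_indexed(beta, lamda, mu))
print(same)
```

Both variants write only the three diagonals after np.zeros, so neither does
the full N*N pass that the Fortran routine performs.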






