[Numpy-discussion] f2py performance question from a rookie
Vasileios Gkinis
v.gkinis@nbi.ku...
Mon Aug 16 11:34:34 CDT 2010
Sturla Molden <sturla <at> molden.no> writes:
>
>
> > It looks like the gain in performance is
> > rather low compared to tests I have seen elsewhere.
> >
> > Am I missing something here..?
> >
> > Cheers...Vasilis
>
> Turn HTML off please.
>
> Use time.clock(), not time.time().
>
> Try some tasks that actually take a while. Tasks that take 10**-4 or
> 10**-3 seconds cannot be reliably timed on Windows or Linux.
>
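(As a side note on the timing advice above, not part of the original reply: sub-millisecond tasks are usually measured with the `timeit` module, which repeats the statement enough times for the total run to dominate the clock's resolution. A minimal sketch, using a throwaway statement as the timed task:)

```python
import timeit

# Repeat a small task many times so the total runtime is well above
# the timer's resolution, then divide to get a per-call estimate.
n_repeat = 10000
total = timeit.timeit("sum(range(100))", number=n_repeat)
per_call = total / n_repeat
print("per-call time: %.3e s" % per_call)
```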
Hi again,
After switching to time.clock, building the f2py module with REPORT_ON_ARRAY_COPY
enabled, and passing arrays as np.asfortranarray(array) to the Fortran routines,
I still see slow performance from f2py. No copied arrays are reported.
With a 6499x6499 array, the f2py version takes about 1.2 s, while the plain
Python for loop does the job slightly faster at 0.9 s.
Comparisons like this one:
http://www.scipy.org/PerformancePython
indicate a 100x-1000x performance boost from f2py compared to
conventional Python for loops.
Still quite puzzled...any help would be very much appreciated.
Regards
Vasileios
The actual Fortran subroutine is here:
subroutine fill_B(j, beta, lamda, mu, b)
    ! Fill the j x j Crank-Nicolson matrix B: tridiagonal in the
    ! interior rows, with b(1,1) = b(j,j) = 1 at the boundaries.
    integer, intent(in) :: j
    integer :: row, col
    real(kind = 8), intent(in) :: beta(j), lamda(j), mu(j)
    real(kind = 8), intent(out) :: b(j,j)
    ! Zero the whole array first: f2py does not guarantee that an
    ! intent(out) array is zero-initialized, so rows 1 and j (apart
    ! from the corners) would otherwise hold garbage.
    b = 0
    do row = 2, j-1
        do col = 1, j
            if (col == row-1) then
                b(row,col) = beta(row) - lamda(row) + mu(row)
            elseif (col == row) then
                b(row,col) = 1 - 2*beta(row)
            elseif (col == row+1) then
                b(row,col) = beta(row) + lamda(row) - mu(row)
            else
                b(row,col) = 0
            endif
        enddo
    enddo
    b(1,1) = 1
    b(j,j) = 1
end subroutine fill_B
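(On the np.asfortranarray point mentioned above: f2py passes arrays to Fortran in column-major order, and converting beforehand is what keeps REPORT_ON_ARRAY_COPY silent. A small sketch, separate from the thread's own code, of what that conversion does:)

```python
import numpy as np

# NumPy arrays are C-contiguous (row-major) by default; Fortran wants
# column-major. np.asfortranarray copies only when the layout differs,
# so an already F-contiguous input triggers no copy (and no f2py report).
a = np.zeros((4, 4))             # C-contiguous by default
fa = np.asfortranarray(a)        # copy in Fortran (column-major) order
print(fa.flags['F_CONTIGUOUS'])  # True
```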
and the Python code that calls it, together with the alternative implementation
using conventional Python for loops, is here:
def crank_heat(self, profile_i, go, g1, beta, lamda, mu, usef2py = False):
    """Advance one Crank-Nicolson step: build A and B, then solve AX = BD."""
    ##Crank Nicolson AX = BD
    ##Make matrix A (coefs of the n+1 step)
    N = np.size(profile_i)
    print(N)
    t1 = time.clock()
    if usef2py == True:
        matrix_A = fill_A.fill_a(j = N, beta = beta, lamda = lamda, mu = mu)
        matrix_B = fill_B.fill_b(j = N, beta = beta, lamda = lamda, mu = mu)
    else:
        matrix_A = np.zeros((N,N))
        matrix_A[0,0] = 1
        matrix_A[-1,-1] = 1
        for row in np.arange(1,N-1):
            matrix_A[row,row-1] = -(beta[row] - lamda[row] + mu[row])
            matrix_A[row, row] = 1 + 2*beta[row]
            matrix_A[row, row+1] = -(beta[row] + lamda[row] - mu[row])
        #make matrix B
        matrix_B = np.zeros((N,N))
        matrix_B[0,0] = 1
        matrix_B[-1,-1] = 1
        for row in np.arange(1,N-1):
            matrix_B[row,row-1] = beta[row] - lamda[row] + mu[row]
            matrix_B[row, row] = 1 - 2*beta[row]
            matrix_B[row, row+1] = beta[row] + lamda[row] - mu[row]
    print("CN function time: %0.5e" %(time.clock()-t1))
    matrix_C = np.dot(matrix_B, profile_i)
    t1 = time.clock()
    matrix_X = np.linalg.solve(matrix_A, matrix_C)
    print("CN solve time: %0.5e" %(time.clock()-t1))
    matrix_X[0] = go
    matrix_X[-1] = g1
    return matrix_X
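(Editor's aside, not from the thread itself: since B is tridiagonal, both the Fortran double loop and the Python loop spend most of their time writing zeros into an N x N dense array. A vectorized NumPy sketch of the same construction, using fancy indexing to fill the three diagonals in whole-array operations, avoids the per-row Python loop entirely:)

```python
import numpy as np

def fill_B_vectorized(beta, lamda, mu):
    # Hypothetical helper: builds the same matrix B as the loop version,
    # but fills the three diagonals with whole-array assignments.
    N = beta.size
    B = np.zeros((N, N))
    rows = np.arange(1, N - 1)
    B[rows, rows - 1] = beta[rows] - lamda[rows] + mu[rows]  # sub-diagonal
    B[rows, rows]     = 1 - 2 * beta[rows]                   # main diagonal
    B[rows, rows + 1] = beta[rows] + lamda[rows] - mu[rows]  # super-diagonal
    B[0, 0] = 1
    B[-1, -1] = 1
    return B
```

For a genuinely large N one might go further and keep B sparse (e.g. scipy.sparse.diags) rather than dense, but even this dense version removes the interpreted inner loop that the f2py comparison was meant to beat.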
More information about the NumPy-Discussion mailing list