[Numpy-discussion] numpy and math sqrt timings

Bruce Sherwood Bruce_Sherwood@ncsu....
Sat Dec 29 01:29:05 CST 2007

On the VPython list, Scott Daniels suggested using try/except to deal 
with the problem that sqrt(5.5) returns a numpy.float64, so that 
sqrt(5.5)*(VPython vector) is no longer a VPython vector, which causes a 
big performance hit in existing programs. I tried his suggestion and did 
some timing using the program shown below.
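
The type difference itself is easy to check. This is a minimal sketch, assuming a current Python 3 and NumPy install; the VPython vector class is left out, so it only shows what each sqrt returns for a plain scalar:

```python
import math
import numpy as np

x = 5.5
m = math.sqrt(x)  # result from the math module
n = np.sqrt(x)    # result from the numpy ufunc

print(type(m).__name__)  # float
print(type(n).__name__)  # float64

# numpy.float64 subclasses the built-in float, but it is a distinct type,
# so arithmetic that mixes it with other objects can be handled by NumPy.
is_float_subclass = isinstance(n, float)
values_agree = abs(m - float(n)) < 1e-12
```

Wrapping the NumPy result in float() would also give back a plain Python float, at the cost of one more call.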

Using "from numpy import *", the numpy sqrt(5.5) gives 5.7 microsec per 
sqrt, whereas using "from math import *" a sqrt is only 0.8 microsec. 
Why is numpy so much slower than math on this simple case? For 
completeness I also timed the old Numeric sqrt, which was 14 microsec, 
so numpy is a big improvement, but still very slow compared to math.

Using Daniels's suggestion of first trying the math sqrt, falling 
through to the numpy sqrt only if the argument isn't a simple scalar, 
gives 1.3 microsec per sqrt on the simple case of a scalar argument. 
Shouldn't/couldn't numpy do something like this internally?

Bruce Sherwood

from math import *
mathsqrt = sqrt
from numpy import *
numpysqrt = sqrt
from time import clock

# 0.8 microsec for "raw" math sqrt
# 5.7 microsec for "raw" numpy sqrt
# 1.3 microsec if we try math sqrt first

def sqrt(x):
    try: return mathsqrt(x)
    except TypeError: return numpysqrt(x)

# Check that numpy sqrt is invoked on an array:
nums = array([1,2,3])
print sqrt(nums)
x = 5.5
N = 500000
t1 = clock()
for n in range(N):
    y = sqrt(x)
    y = sqrt(x)
    y = sqrt(x)
    y = sqrt(x)
    y = sqrt(x)
    y = sqrt(x)
    y = sqrt(x)
    y = sqrt(x)
    y = sqrt(x)
    y = sqrt(x)
t2 = clock()
for n in range(N):
    pass
t3 = clock()
# t3-t2 is the loop overhead (turns out negligible)
print "%i loops over 10 sqrt's takes %.1f seconds" % (N,t2-t1)
print "Total loop overhead = %.2f seconds (negligible)" % (t3-t2)
print "One sqrt takes %.1f microseconds" % (1e6*((t2-t1)-(t3-t2))/(10*N))
