[Numpy-discussion] performance problem
oliphant at ee.byu.edu
Mon Jan 30 13:34:07 CST 2006
Gerard Vermeulen wrote:
> >>> t1 = Timer('a <<= 8', 'import numarray as NX; a = NX.ones(10**6, NX.UInt32)')
> >>> t2 = Timer('a <<= 8', 'import numpy as NX; a = NX.ones(10**6, NX.UInt32)')
Although this slow-down ultimately turned out to be a coercion issue, I
still wondered about the extra dereference in the 1-d loop when one of
the inputs is a scalar. So I added a patch that checks for that case and
uses a separate loop for it.
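
For illustration only, here is a minimal sketch of what such a special case might look like. It is not the actual patch: the function names, the simplified `args`/`steps` signature, and the fixed `uint32_t` element type are all assumptions made for this example. The generic loop reads the second operand through its pointer on every iteration, even when its stride is 0. The special-cased version loads the scalar once before the loop.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical generic 1-d loop: every operand is read through its own
 * strided pointer, so a scalar operand (stride 0) is still dereferenced
 * on each iteration. Assumes every shift count is less than 32. */
void lshift_generic(char **args, const ptrdiff_t *steps, size_t n)
{
    char *in1 = args[0], *in2 = args[1], *out = args[2];
    for (size_t i = 0; i < n; i++) {
        *(uint32_t *)out = *(uint32_t *)in1 << *(uint32_t *)in2;
        in1 += steps[0];
        in2 += steps[1];
        out += steps[2];
    }
}

/* Hypothetical special case: when the second operand has stride 0 it is
 * a scalar, so load it once and keep it in a local variable. Assumes the
 * output does not alias the scalar operand. */
void lshift_scalar_aware(char **args, const ptrdiff_t *steps, size_t n)
{
    if (steps[1] != 0) {
        /* Second operand is a real array: fall back to the generic loop. */
        lshift_generic(args, steps, n);
        return;
    }
    char *in1 = args[0], *out = args[2];
    const uint32_t shift = *(uint32_t *)args[1];
    for (size_t i = 0; i < n; i++) {
        *(uint32_t *)out = *(uint32_t *)in1 << shift;
        in1 += steps[0];
        out += steps[2];
    }
}
```

For an in-place `a <<= 8`, `args[0]` and `args[2]` would both point at the array data, `args[1]` at the scalar 8, and `steps` would be `{4, 0, 4}`. Whether the second form is actually faster depends on the compiler and the machine, since the compiler cannot always prove that the store through `out` leaves `*in2` unchanged and so may not hoist the load by itself.
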
It seemed to give a small performance boost on my system. I'm wondering
if such special-case coding is wise in general. Are there other ways to
get C-compilers to produce faster code on modern machines?