[Numpy-discussion] Changes to improve performance on small matrices
Gary Bishop
gb at cs.unc.edu
Wed Oct 3 12:08:08 CDT 2001
We are porting some code from Matlab that does many thousands of
operations on small matrices. We have found that some small changes to
LinearAlgebra.py, Numeric.py, and multiarraymodule.c make our code run
approximately twice as fast as with the unmodified Numeric-20.2 code.
Our changes specifically replace LinearAlgebra._castCopyAndTranspose
with a version we call _fastCopyAndTranspose that does most of the work
in C instead of calling Numeric.transpose, which calls arange, which is
quite expensive. We also optimized Numeric.dot to call a new function
multiarray.matrixproduct which does the axis swap on the fly instead of
calling swapaxes (which calls arange) and then calling innerproduct
(which as the very first step copies the transposed matrix to make it
contiguous).
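
To illustrate the idea behind the matrixproduct change, here is a minimal sketch in plain Python lists. It is not the Numeric or C code from our patch; the function names transpose_copy, inner_product, dot_via_transpose, and matrix_product are hypothetical stand-ins chosen for this example. It only contrasts building a transposed copy before an inner product with indexing the second operand in swapped order on the fly.

```python
# Illustrative sketch only: plain Python lists, not the Numeric/C implementation.

def transpose_copy(m):
    """Return a new copy of m with rows and columns swapped."""
    return [list(col) for col in zip(*m)]


def inner_product(a, b):
    """Contract the last axis of a with the last axis of b."""
    return [[sum(x * y for x, y in zip(row_a, row_b)) for row_b in b]
            for row_a in a]


def dot_via_transpose(a, b):
    """Older-style path: materialize the transpose of b, then inner product."""
    return inner_product(a, transpose_copy(b))


def matrix_product(a, b):
    """Direct path: read b as b[k][j], so no transposed copy is built."""
    n_inner = len(b)
    n_cols = len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(n_inner))
             for j in range(n_cols)]
            for row in a]


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]

# Both paths compute the same product; only the intermediate copy differs.
assert dot_via_transpose(a, b) == matrix_product(a, b)
```

For small matrices the arithmetic is cheap, so the extra copy and the per-call setup can make up a large share of the cost, which is why skipping them matters in this setting.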
Formerly our code spent much of its time in arange (which we never
explicitly call), dot, and _castCopyAndTranspose. The changes described
above eliminate this overhead.
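
As a rough sketch of how a helper that is never called explicitly can still show up in a profile, the example below uses the standard cProfile and pstats modules. It is not our actual measurement: make_indices, transpose_small, and workload are hypothetical stand-ins, with make_indices playing the role of a hidden helper such as arange.

```python
import cProfile
import pstats


def make_indices(n):
    """Hypothetical helper that callers never invoke directly."""
    return list(range(n))


def transpose_small(m):
    """Transpose via an index list, so every call pays for make_indices."""
    cols = make_indices(len(m[0]))
    return [[row[j] for row in m] for j in cols]


def workload(repeats=1000):
    """Many operations on one small matrix, as in the ported code."""
    m = [[1, 2], [3, 4]]
    for _ in range(repeats):
        transpose_small(m)


def profiled_call_counts(func):
    """Run func under cProfile and return {function name: total calls}."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    stats = pstats.Stats(profiler)
    counts = {}
    # Keys are (filename, lineno, name); values start with (primitive, total).
    for (_file, _line, name), entry in stats.stats.items():
        counts[name] = counts.get(name, 0) + entry[1]
    return counts


counts = profiled_call_counts(workload)
```

Here workload never mentions make_indices, yet the call counts report one call to it for every call to transpose_small.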
I'm writing to ask whether these changes might make a worthy patch to
NumPy. We have tested them on Windows 2000 (both native and under
Cygwin) and on Linux. Soon, we'll have a test on Mac OS X.1.
If anyone is interested, I will figure out how to generate a patch file
to submit.
Thanks
gb