[SciPy-User] fast small matrix multiplication with cython?
Pauli Virtanen
pav@iki...
Mon Dec 6 18:11:12 CST 2010
On Mon, 06 Dec 2010 17:34:19 -0500, Skipper Seabold wrote:
> I'm wondering if anyone might have a look at my cython code that does
> matrix multiplication and see where I can speed it up or offer some
> pointers/reading. I'm new to Cython and my knowledge of C is pretty
> basic based on trial and (mostly) error, so I am sure the code is still
> very naive.
You'll be hard pressed to do better than NumPy's dot. In the raw data
handling, BLAS is very likely faster than most things you can code
manually. Moreover, the Cython routine you write will have as much
overhead as dot(): dealing with refcounting, allocating/deallocating
PyArrayObjects (which is expensive), etc.
If you are willing to give up wrapping each small matrix in a separate
NumPy ndarray, then you can expect additional speed gains. (Even in that
case it could make more sense to call BLAS routines to do the
multiplication, unless your matrices are small and of fixed size, in
which case the C compiler may be able to produce some tightly optimized
code.)
In many cases, however, the small matrices can simply be stuffed into a
single NumPy array. At the moment there is no "vectorized" matrix
multiplication routine, so that would have to be written, e.g. in Cython.
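
To make the "single array" idea concrete, here is a rough sketch in plain
NumPy rather than Cython. It assumes the n small matrices are stacked
along the first axis of a 3-D array; the name stacked_matmul and the
shapes used are made up for illustration only. It uses broadcasting plus
a sum over the shared index, and checks the result against one dot() call
per pair.

```python
import numpy as np

def stacked_matmul(a, b):
    """Multiply n pairs of small matrices stored in two 3-D arrays.

    a has shape (n, p, q) and b has shape (n, q, r);
    the result has shape (n, p, r).
    """
    # Broadcast both operands to shape (n, p, q, r), multiply
    # elementwise, then sum over the shared index q (axis 2).
    return (a[:, :, :, np.newaxis] * b[:, np.newaxis, :, :]).sum(axis=2)

# Hypothetical test data: 1000 pairs of 3x4 and 4x2 matrices.
rng = np.random.RandomState(0)
a = rng.rand(1000, 3, 4)
b = rng.rand(1000, 4, 2)

c = stacked_matmul(a, b)

# Reference result: one dot() call (and one new ndarray) per pair.
c_ref = np.array([np.dot(a[i], b[i]) for i in range(len(a))])
assert np.allclose(c, c_ref)
```

This avoids creating a Python-level ndarray per matrix, but it builds a
temporary of n*p*q*r elements, so it is only reasonable while the
individual matrices stay small. A Cython loop over the same 3-D buffer
would avoid that temporary.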
