[SciPy-User] fast small matrix multiplication with cython?

Pauli Virtanen pav@iki...
Mon Dec 6 18:11:12 CST 2010

On Mon, 06 Dec 2010 17:34:19 -0500, Skipper Seabold wrote:
> I'm wondering if anyone might have a look at my cython code that does
> matrix multiplication and see where I can speed it up or offer some
> pointers/reading.  I'm new to Cython and my knowledge of C is pretty
> basic based on trial and (mostly) error, so I am sure the code is still
> very naive.

You'll be hard pressed to do better than Numpy's dot. In the raw data 
handling, BLAS is very likely faster than most things you can code 
manually. Moreover, the Cython routine you write will necessarily carry 
about as much overhead as dot(): dealing with refcounting, allocating and 
deallocating PyArrayObjects (which is expensive), etc.
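
To get a feel for how much of the time goes into that per-call overhead 
rather than the arithmetic, something like the following rough sketch 
could be used. The helper name per_call_seconds and the chosen sizes are 
only illustrative, and the numbers will depend on your machine and BLAS.

```python
import timeit

import numpy as np


def per_call_seconds(n, repeats=20000):
    """Average wall-clock time of one np.dot call on two n x n matrices."""
    a = np.random.rand(n, n)
    b = np.random.rand(n, n)
    total = timeit.timeit(lambda: np.dot(a, b), number=repeats)
    return total / repeats


if __name__ == "__main__":
    for n in (2, 4, 8, 16):
        t = per_call_seconds(n)
        # n**3 multiply-adds per product; if the time per multiply-add
        # drops sharply as n grows, fixed call overhead is dominating.
        print("n=%2d  %.3e s per call  %.3e s per multiply-add"
              % (n, t, t / n ** 3))
```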

If you are willing to give up wrapping each small matrix in a separate 
Numpy ndarray, then you can expect to get additional speed gains. 
(Although even in that case it could make more sense to call BLAS 
routines to do the multiplication instead, unless your matrices are small 
and of fixed size in which case the C compiler may be able to produce 
some tightly optimized code.)

However, in many cases the small matrices can just be stuffed into a 
single Numpy array. At the moment there is no "vectorized" matrix 
multiplication routine, though, so one would have to be written, e.g. in 
Cython.
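
As a rough illustration of the "single array" layout, here is a minimal 
sketch that assumes the k small matrices are stacked along the first 
axis. It is not a dedicated routine, just plain broadcasting plus a sum, 
and the name stacked_matmul is made up for this example.

```python
import numpy as np


def stacked_matmul(a, b):
    """Multiply k pairs of small matrices stored in two 3-D arrays.

    a has shape (k, m, n), b has shape (k, n, p); the result has
    shape (k, m, p), with result[i] equal to np.dot(a[i], b[i]).
    """
    # a[:, :, :, None] has shape (k, m, n, 1)
    # b[:, None, :, :] has shape (k, 1, n, p)
    # Broadcasting the product gives (k, m, n, p); summing over axis 2
    # contracts the shared index n.
    return (a[:, :, :, None] * b[:, None, :, :]).sum(axis=2)


a = np.random.rand(1000, 3, 3)
b = np.random.rand(1000, 3, 3)
c = stacked_matmul(a, b)

# Spot check against the one-matrix-at-a-time version.
assert np.allclose(c[0], np.dot(a[0], b[0]))
```

This avoids creating one ndarray per small matrix, but it builds a 
temporary of shape (k, m, n, p), so a hand-written loop in Cython or a 
BLAS call per matrix may still win on memory or speed.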

Pauli Virtanen
