[SciPy-User] fast small matrix multiplication with cython?

Skipper Seabold jsseabold@gmail....
Mon Dec 6 18:30:29 CST 2010

On Mon, Dec 6, 2010 at 7:11 PM, Pauli Virtanen <pav@iki.fi> wrote:
> On Mon, 06 Dec 2010 17:34:19 -0500, Skipper Seabold wrote:
> > I'm wondering if anyone might have a look at my cython code that does
> > matrix multiplication and see where I can speed it up or offer some
> > pointers/reading.  I'm new to Cython and my knowledge of C is pretty
> > basic based on trial and (mostly) error, so I am sure the code is still
> > very naive.
> You'll be hard pressed to do better than Numpy's dot. In the raw data
> handling, BLAS is very likely faster than most things you can code
> manually. Moreover, the Cython routine you write must have as much
> overhead as dot() --- dealing with refcounting, allocating/deallocating
> PyArrayObjects (which is expensive) etc.
> If you are willing to give up wrapping each small matrix in a separate
> Numpy ndarray, then you can expect to get additional speed gains.
> (Although even in that case it could make more sense to call BLAS
> routines to do the multiplication instead, unless your matrices are small
> and of fixed size in which case the C compiler may be able to produce
> some tightly optimized code.)
> However, in many cases the small matrices can be just stuffed into a
> single Numpy array. At the moment there is no "vectorized" matrix
> multiplication routine, however, so that could be written e.g. in Cython.
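
For reference, a minimal sketch of what "stuffing the small matrices into a single array" could look like, using only broadcasting. The name stacked_matmul and the (n, rows, cols) layout are assumptions made for illustration, not something from the thread, and this is plain NumPy rather than the Cython routine suggested above. It trades a temporary array of shape (n, i, k, j) for a single vectorized call, which is only reasonable while the matrices stay small.

```python
import numpy as np

def stacked_matmul(a, b):
    """Multiply n small matrices pairwise in one vectorized step.

    a has shape (n, i, k) and b has shape (n, k, j); the result has
    shape (n, i, j) with result[m] equal to np.dot(a[m], b[m]).
    (Hypothetical helper; the name and layout are illustrative.)
    """
    a = np.asarray(a)
    b = np.asarray(b)
    if a.ndim != 3 or b.ndim != 3:
        raise ValueError("expected two 3-d arrays of stacked matrices")
    if a.shape[0] != b.shape[0] or a.shape[2] != b.shape[1]:
        raise ValueError("stack sizes or inner dimensions do not match")
    # (n, i, k, 1) * (n, 1, k, j) broadcasts to (n, i, k, j);
    # summing over axis 2 contracts the shared index k.
    return (a[:, :, :, None] * b[:, None, :, :]).sum(axis=2)

if __name__ == "__main__":
    rng = np.random.RandomState(0)
    a = rng.rand(1000, 3, 3)
    b = rng.rand(1000, 3, 3)
    out = stacked_matmul(a, b)
    # Compare against one np.dot call per matrix pair.
    ref = np.array([np.dot(x, y) for x, y in zip(a, b)])
    print(np.allclose(out, ref))
```

The point of the layout is that the per-array Python overhead is paid once for the whole stack instead of once per small matrix.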

Ah, I see.  I didn't think about the overhead of PyArrayObject.
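
A rough way to look at that per-call overhead, as a sketch only: time np.dot on a few small sizes and compare the cost per call. The helper name per_call_seconds and the sizes are assumptions chosen for illustration, and the numbers depend entirely on the machine and the NumPy/BLAS build, so none are shown here.

```python
import timeit
import numpy as np

def per_call_seconds(n, repeats=20000):
    """Average wall-clock seconds for one np.dot call on two n-by-n arrays.

    (Hypothetical helper for illustration; not from the thread.)
    """
    a = np.random.rand(n, n)
    b = np.random.rand(n, n)
    # timeit runs the callable `repeats` times and returns total seconds.
    total = timeit.timeit(lambda: np.dot(a, b), number=repeats)
    return total / repeats

if __name__ == "__main__":
    for n in (2, 4, 8):
        print(n, per_call_seconds(n))
```

If the time per call barely changes between these sizes, that would suggest the fixed cost of the call (argument handling, allocating the result array) dominates the arithmetic for matrices this small.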

