[SciPy-User] fast small matrix multiplication with cython?

Dag Sverre Seljebotn dagss@student.matnat.uio...
Tue Dec 7 02:51:59 CST 2010

On 12/07/2010 07:56 AM, Fernando Perez wrote:
> Hi Skipper,
> On Mon, Dec 6, 2010 at 2:34 PM, Skipper Seabold <jsseabold@gmail.com> wrote:
>> I'm wondering if anyone might have a look at my cython code that does
>> matrix multiplication and see where I can speed it up or offer some
>> pointers/reading.  I'm new to Cython and my knowledge of C is pretty
>> basic based on trial and (mostly) error, so I am sure the code is
>> still very naive.
> a few years ago I had a similar problem, and I ended up getting a very
> significant speedup by hand-coding a very unsafe, but very fast pure C
> extension just to compute these inner products.  This was basically a
> replacement for dot() that would only work with double precision
> inputs of compatible dimensions and would happily segfault with
> anything else, but it ran very fast.  The inner loop is implemented
> completely naively, but it still beats calls to BLAS (even linked with
> ATLAS) for small matrix dimensions (my case was also up to ~ 15x15).

Another idea: If the matrices are more in the intermediate range, here's 
a Cython library for calling BLAS more directly:


For intermediate-size matrices, the SSE instructions used by BLAS should
be able to offset any call overhead. Try to steer clear of using NumPy
for slicing, though; one should do pointer arithmetic on the underlying
data buffer instead...
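
To make the pointer-arithmetic point concrete, here is a minimal sketch
in plain C of a naive small-matrix product that addresses sub-blocks by
offsetting a pointer rather than by slicing. The function name
small_dgemm and the row-stride arguments lda/ldb/ldc are made up for
this illustration (they only mimic the BLAS "leading dimension" idea),
and the code assumes row-major, double-precision data. It is neither
Fernando's extension nor the API of any library.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: C (m x n) = A (m x k) * B (k x n).
 * All matrices are row-major doubles. lda, ldb and ldc are the row
 * strides, i.e. the number of elements between the starts of two
 * consecutive rows in the parent buffer. No type or shape checking
 * is done beyond the debug-only assert below. */
void small_dgemm(size_t m, size_t n, size_t k,
                 const double *a, size_t lda,
                 const double *b, size_t ldb,
                 double *c, size_t ldc)
{
    /* A row stride shorter than the row length would overlap rows. */
    assert(lda >= k && ldb >= n && ldc >= n);

    for (size_t i = 0; i < m; ++i) {
        const double *a_row = a + i * lda;   /* row i of A */
        double *c_row = c + i * ldc;         /* row i of C */

        for (size_t j = 0; j < n; ++j)
            c_row[j] = 0.0;

        /* i-p-j loop order: the innermost loop walks B and C
         * contiguously, which is friendly to the cache and to
         * compiler auto-vectorization. */
        for (size_t p = 0; p < k; ++p) {
            const double a_ip = a_row[p];
            const double *b_row = b + p * ldb;   /* row p of B */
            for (size_t j = 0; j < n; ++j)
                c_row[j] += a_ip * b_row[j];
        }
    }
}
```

With a 4x4 row-major buffer a, the 2x2 block starting at row 1,
column 2 is simply the pointer a + 1*4 + 2 passed with lda = 4, so no
temporary array or slice object is created. From Cython, the same
pointer could be taken from the array's data buffer and handed to a
routine like this one.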

Dag Sverre
