[SciPy-User] fast small matrix multiplication with cython?

Keith Goodman kwgoodman@gmail....
Tue Dec 7 09:35:54 CST 2010


On Mon, Dec 6, 2010 at 2:34 PM, Skipper Seabold <jsseabold@gmail.com> wrote:
> I'm wondering if anyone might have a look at my cython code that does
> matrix multiplication and see where I can speed it up or offer some
> pointers/reading.  I'm new to Cython and my knowledge of C is pretty
> basic based on trial and (mostly) error, so I am sure the code is
> still very naive.
>
> <snip>
>
>    cdef ndarray[DOUBLE, ndim=2] out = PyArray_SimpleNew(2, dims, NPY_DOUBLE)

I'd like to reduce the overhead in creating the empty array. Using
PyArray_SimpleNew in Cython is faster than using np.empty but both are
slower than using np.empty without Cython. Have I done something
wrong? I suspect is has something to do with this line in the code
below: "cdef npy_intp *dims = [r, c]"

PyArray_SimpleNew:
>> timeit matmult(2,2)
1000000 loops, best of 3: 773 ns per loop

np.empty in cython:
>> timeit matmult2(2,2)
1000000 loops, best of 3: 1.62 us per loop

np.empty in python:
>> timeit np.empty((2,2))
1000000 loops, best of 3: 465 ns per loop

Code:

import numpy as np
from numpy cimport float64_t, ndarray, NPY_DOUBLE, npy_intp

ctypedef float64_t DOUBLE

cdef extern from "numpy/arrayobject.h":
    cdef void import_array()
    cdef object PyArray_SimpleNew(int nd, npy_intp *dims, int typenum)

# initialize numpy
import_array()

def matmult(int r, int c):
    cdef npy_intp *dims = [r, c]  # Is there a faster way to do this?
    cdef ndarray[DOUBLE, ndim=2] out = PyArray_SimpleNew(2, dims, NPY_DOUBLE)
    return out

def matmult2(int r, int c):
    cdef ndarray[DOUBLE, ndim=2] out = np.empty((r, c), dtype=np.float64)
    return out


More information about the SciPy-User mailing list