[SciPy-User] fast small matrix multiplication with cython?

Keith Goodman kwgoodman@gmail....
Tue Dec 7 11:08:51 CST 2010

On Tue, Dec 7, 2010 at 8:52 AM, Dag Sverre Seljebotn
<dagss@student.matnat.uio.no> wrote:
> On 12/07/2010 04:35 PM, Keith Goodman wrote:
>> On Mon, Dec 6, 2010 at 2:34 PM, Skipper Seabold<jsseabold@gmail.com>  wrote:
>>> I'm wondering if anyone might have a look at my cython code that does
>>> matrix multiplication and see where I can speed it up or offer some
>>> pointers/reading.  I'm new to Cython and my knowledge of C is pretty
>>> basic based on trial and (mostly) error, so I am sure the code is
>>> still very naive.
>>> <snip>
>>>     cdef ndarray[DOUBLE, ndim=2] out = PyArray_SimpleNew(2, dims, NPY_DOUBLE)
>> I'd like to reduce the overhead in creating the empty array. Using
>> PyArray_SimpleNew in Cython is faster than using np.empty but both are
>> slower than using np.empty without Cython. Have I done something
>> wrong? I suspect it has something to do with this line in the code
>> below: "cdef npy_intp *dims = [r, c]"
> Nope, unless something very strange is going on, that line would be
> ridiculously fast compared to the rest. Basically just copying two
> integers on the stack.
> Try PyArray_EMPTY?

PyArray_EMPTY is a little faster (but np.empty is still much faster):

>> timeit matmult(2,2)
1000000 loops, best of 3: 778 ns per loop

>> timeit matmult3(2,2)
1000000 loops, best of 3: 763 ns per loop

np.empty in pure Python, for comparison:
>> timeit np.empty((2,2))
1000000 loops, best of 3: 470 ns per loop
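
For anyone who wants to reproduce the pure-Python number outside IPython, here is a minimal sketch using the standard-library timeit module. The helper name time_empty and its defaults are made up for illustration and are not from the thread; absolute numbers will vary with machine and NumPy version.

```python
import timeit

import numpy as np


def time_empty(shape=(2, 2), number=100_000, repeat=3):
    """Return the best per-call time, in nanoseconds, of np.empty(shape).

    Hypothetical helper: mirrors IPython's "best of 3" reporting.
    """
    # Time the statement as a string so no extra Python-level function
    # call (e.g. a lambda) is included in the measurement.
    timer = timeit.Timer("np.empty(shape)", globals={"np": np, "shape": shape})
    best_total = min(timer.repeat(repeat=repeat, number=number))
    return best_total / number * 1e9


if __name__ == "__main__":
    print("np.empty((2, 2)): %.0f ns per call" % time_empty())
```

The same approach should work for a compiled Cython function by passing it in through the globals dictionary.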

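def matmult3(int r, int c):
    # Output shape as a C array of npy_intp.
    cdef npy_intp *dims = [r, c]
    # Uninitialized r x c float64 array; the last argument (0) selects C order.
    cdef ndarray[DOUBLE, ndim=2] out = PyArray_EMPTY(2, dims, NPY_FLOAT64, 0)
    return out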
