[SciPy-User] fast small matrix multiplication with cython?

Charles R Harris charlesr.harris@gmail....
Tue Dec 7 11:17:05 CST 2010


On Tue, Dec 7, 2010 at 10:05 AM, <josef.pktd@gmail.com> wrote:

> On Tue, Dec 7, 2010 at 11:53 AM, Charles R Harris
> <charlesr.harris@gmail.com> wrote:
> >
> >
> > On Tue, Dec 7, 2010 at 9:47 AM, Skipper Seabold <jsseabold@gmail.com>
> wrote:
> >>
> >> On Tue, Dec 7, 2010 at 9:54 AM, Charles R Harris
> >> <charlesr.harris@gmail.com> wrote:
> >> >
> >> >
> >> > On Mon, Dec 6, 2010 at 11:56 PM, Fernando Perez <fperez.net@gmail.com
> >
> >> > wrote:
> >> >>
> >> >> Hi Skipper,
> >> >>
> >> >> On Mon, Dec 6, 2010 at 2:34 PM, Skipper Seabold <jsseabold@gmail.com
> >
> >> >> wrote:
> >> >> > I'm wondering if anyone might have a look at my cython code that
> does
> >> >> > matrix multiplication and see where I can speed it up or offer some
> >> >> > pointers/reading.  I'm new to Cython and my knowledge of C is
> pretty
> >> >> > basic based on trial and (mostly) error, so I am sure the code is
> >> >> > still very naive.
> >> >>
> >> >> a few years ago I had a similar problem, and I ended up getting a
> very
> >> >> significant speedup by hand-coding a very unsafe, but very fast pure
> C
> >> >> extension just to compute these inner products.  This was basically a
> >> >> replacement for dot() that would only work with double precision
> >> >> inputs of compatible dimensions and would happily segfault with
> >> >> anything else, but it ran very fast.  The inner loop is implemented
> >> >> completely naively, but it still beats calls to BLAS (even linked
> with
> >> >> ATLAS) for small matrix dimensions (my case was also up to ~ 15x15).
> >> >>
> >> >> I'm attaching the code in case you find it useful, please keep in
> mind
> >> >> I haven't compiled it in years, so it may have bit-rotted a little.
> >> >>
> >> >
> >> > Blas adds quite a bit of overhead for multiplying small matrices, but
> so
> >> > does calling from python. For implementing Kalman filters it might be
> >> > better
> >> > to write a whole Kalman class so that operations can be combined at
> the
> >> > c
> >> > level.
> >> >
> >> > Skipper, what kind of Kalman filter are you trying to implement?
> >> >
> >>
> >> It's just a linear Gaussian filter.  I use it to get the loglikelihood
> >> of a univariate ARMA series with exact initial conditions.  As it
> >> stands it is fairly inflexible, but if I can make it fast I would like
> >> to generalize it.
> >>
> >> There is a fair amount of scratch work in here, and some attempts at
> >> generalized state space models, but all the action for my purposes is
> >> in KalmanFilter.loglike
> >>
> >>
> >>
> http://bazaar.launchpad.net/~jsseabold/statsmodels/statsmodels-skipper/annotate/head%3A/scikits/statsmodels/tsa/kalmanf/kalmanf.py#L505<http://bazaar.launchpad.net/%7Ejsseabold/statsmodels/statsmodels-skipper/annotate/head%3A/scikits/statsmodels/tsa/kalmanf/kalmanf.py#L505>
> >>
> >> It's not terribly slow, but I have to maximize the likelihood using
> >> numerical derivatives, so it's getting called quite a few times.  A
> >> 1000 observation ARMA(2,2) series takes about 5-6 seconds on my
> >> machine with fmin_l_bfgs_b.
> >>
> >
> > Just a guess here, but the numerical derivative bit makes it sounds like
> you
> > are implementing a generalized Kalman filter. Have you looked at
> unscented
> > Kalman filters?
>
> It's still a linear filter, non-linear optimization comes in because
> the exact loglikelihood function for ARMA is non-linear in the
> coefficients.
> (There might be a way to calculate the derivative in the same loop,
> but that's a different issue.)
>
>
The unscented Kalman filter is a better way to estimate the covariance of a
non-linear process, think of it as a better integrator. If the propagation
is easy to compute, which seems to be the case here, it will probably save
you some time. You might even be able to use the basic idea and skip the
Kalman part altogether.

My general aim here is to optimize the algorithm first before getting caught
up in the details of matrix multiplication in c. Premature optimization and
all that.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.scipy.org/pipermail/scipy-user/attachments/20101207/5364b77f/attachment.html 


More information about the SciPy-User mailing list