[SciPy-dev] code for incremental least squares

Stéfan van der Walt stefan@sun.ac...
Wed Feb 17 01:07:23 CST 2010


Hi Nathaniel

On 16 February 2010 23:56, Nathaniel Smith <njs@pobox.com> wrote:
> The basic idea inside 'update' is pretty elementary... if you have a
> model matrix
>  X = np.row_stack([X1, X2, X3, ...])
> then what we need for least squares calculations is
>  X'X = X1'X1 + X2'X2 + X3'X3 + ...
> and we can compute this sum incrementally as the X_i matrices arrive.

Forming the product A^T A is often a bad idea from a numerical
perspective.  In Chapter 3 of Ake Bjork's "Numerical Methods for Least
Squares Problems", he talks about "update" problems.  He mentions that
"...the solution should be accurate up to the limitation of data and
conditioning of the problem; i.e., a stable numerical method must be
used."

He describes the Kalman-based update method (which bears some
resemblance to yours), but says that "the main disadvantage...is its
serious sensitivity to roundoff errors.  The updating algorithms based
on orthogonal transformations developed in the following sections are
generally to be preferred."  He then goes into more detail on updating
the QR and Gram-Schmidt decompositions.

Not sure if that helps, but it may be worth reading that chapter.

> For the QR code I use a recurrence relation I stole from some slides
> by Simon Wood to compute R and Q'y incrementally; probably a real
> incremental QR (e.g., "AS274", which is what R's biglm package uses)
> would be better, but this one is easy to implement in terms of
> non-incremental QR.

Incremental QR is something we should implement in scipy.linalg, for sure.

Regards
Stéfan


More information about the SciPy-Dev mailing list