[Numpy-discussion] Linear least squares
Charles R Harris
Wed Jan 9 10:42:30 CST 2013
On Wed, Jan 9, 2013 at 1:29 AM, Till Stensitz <firstname.lastname@example.org> wrote:
> Nathaniel Smith <njs <at> pobox.com> writes:
> > An obvious thing is that it always computes residuals, which could be
> > costly; if your pinv code isn't doing that then it's not really
> > comparable. (Though might still be well-suited for your actual
> > problem.)
> > Depending on how well-conditioned your problems are, and how much
> > speed you need, there are faster ways than pinv as well. (Going via qr
> > might or might not, going via cholesky almost certainly will be.)
> > -n
> You are right. With calculating the residuals, the speedup goes
> down to a factor of 2. I had to calculate the residuals anyway because
> lstsq only returns the squared sum of the residuals, while I need every
> residual (as an input to optimize.leastsq).
Same here. Unfortunately the residuals computed by the LAPACK function are
in a different basis so aren't directly usable. I'd support adding a
keyword to disable the usual computation of the sum of squares.
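
For anyone following along, here is a minimal sketch of the point above,
not taken from anyone's actual code: np.linalg.lstsq hands back only the
sum of squared residuals, so the individual residuals have to be formed
separately. The matrix A and vector b are made-up random data.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((50, 3))   # design matrix (made-up data)
b = rng.standard_normal(50)        # observations (made-up data)

# lstsq returns the solution and the *sum* of squared residuals only
x, ssq, rank, sv = np.linalg.lstsq(A, b, rcond=None)

# the individual residuals need one extra matrix-vector product
residuals = b - A @ x

# the same solution via the pseudo-inverse, for comparison
x_pinv = np.linalg.pinv(A) @ b
```

The extra product b - A @ x is the cost that has to be included when
timing a pinv-based solve against lstsq.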
> Josef is also right, it is shape dependent. For his example, lstsq is faster.
> Maybe it is possible to make lstsq to choose its method automatically?
> Or some keyword to set the method and making other decompositions
QR without column pivoting is a nice option for "safe" problems, but it
doesn't provide a reliable indication of rank reduction. I also don't find
pinv useful once the rank goes down, since it relies on Euclidean distance
having relevance in parameter space, and that is seldom a sound assumption.
Usually it is better to reformulate the problem or remove a column from the
design matrix. So maybe an 'unsafe', or less suggestively, 'fast' keyword
could also be an option. IIRC, this was discussed on the scipy mailing list
a year or two ago.
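
As a rough illustration of what such a 'fast' path could look like (a
sketch only, not a proposed implementation), the two helpers below solve a
full-rank least squares problem via QR without column pivoting and via
Cholesky on the normal equations. The names lstsq_qr and lstsq_cholesky
are hypothetical, and the data is random. Neither helper checks for rank
deficiency, which is exactly the "unsafe" part.

```python
import numpy as np

def lstsq_qr(A, b):
    """Least squares via reduced QR without column pivoting (full rank assumed)."""
    Q, R = np.linalg.qr(A)              # A = Q R, R is upper triangular
    # np.linalg.solve ignores the triangular structure of R;
    # scipy.linalg.solve_triangular would exploit it.
    return np.linalg.solve(R, Q.T @ b)

def lstsq_cholesky(A, b):
    """Least squares via the normal equations A^T A x = A^T b (full rank assumed)."""
    L = np.linalg.cholesky(A.T @ A)     # A^T A = L L^T, L is lower triangular
    y = np.linalg.solve(L, A.T @ b)     # forward step:  L y = A^T b
    return np.linalg.solve(L.T, y)      # backward step: L^T x = y

rng = np.random.default_rng(1)
A = rng.standard_normal((200, 4))       # well-conditioned made-up problem
b = rng.standard_normal(200)

x_ref = np.linalg.lstsq(A, b, rcond=None)[0]
x_qr = lstsq_qr(A, b)
x_chol = lstsq_cholesky(A, b)
```

Forming A.T @ A squares the condition number, so the Cholesky route is
only reasonable for well-conditioned problems. If A.T @ A is numerically
singular, np.linalg.cholesky raises LinAlgError, while the QR route gives
no such signal.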