[SciPy-User] Generalized least square on large dataset

Charles R Harris charlesr.harris@gmail....
Wed Mar 7 22:00:24 CST 2012

On Wed, Mar 7, 2012 at 8:46 PM, Charles R Harris

> On Wed, Mar 7, 2012 at 7:39 PM, Peter Cimermančič <
> peter.cimermancic@gmail.com> wrote:
>> Hi,
>> I'd like to linearly fit the data that were NOT sampled independently. I
>> came across generalized least square method:
>> b=(X'*V^(-1)*X)^(-1)*X'*V^(-1)*Y
>> X and Y are coordinates of the data points, and V is a "variance matrix".
>> The equation is Matlab format - I've tried solving problem there too, bit
>> it didn't work - but eventually I'd like to be able to solve problems like
>> that in python. The problem is that due to its size (1000 rows and
>> columns), the V matrix becomes singular, thus un-invertable. Any
>> suggestions for how to get around this problem? Maybe using a way of
>> solving generalized linear regression problem other than GLS?
> Plain old least squares will probably do a decent job for the fit, where
> you will run into trouble is if you want to estimate the covariance. The
> idea of using the variance matrix is to transform the data set into
> independent observations of equal variance, but except in extreme cases
> that shouldn't really be necessary if you have sufficient data points.
> Weighting the data is a simple case of this that merely equalizes the
> variance, and it often doesn't make that much difference.
To expand a bit, if it is simply the case that the measurement errors
aren't independent and you know their covariance, then you want to minimize
(y - Ax)^T * cov^-1 * (y - ax) and if you factor cov^-1 into U^T * U, then
you can solve the ordinary least squares problem U*A*x = U*y. I can't
really tell what your data/problem is like without more details.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.scipy.org/pipermail/scipy-user/attachments/20120307/d5c292e3/attachment.html 

More information about the SciPy-User mailing list