# [SciPy-User] Generalized least square on large dataset

Charles R Harris charlesr.harris@gmail....
Wed Mar 7 21:46:53 CST 2012

```
On Wed, Mar 7, 2012 at 7:39 PM, Peter Cimermančič <
peter.cimermancic@gmail.com> wrote:

> Hi,
>
> I'd like to linearly fit data that were NOT sampled independently. I
> came across the generalized least squares method:
>
> b=(X'*V^(-1)*X)^(-1)*X'*V^(-1)*Y
>
> X and Y are coordinates of the data points, and V is a "variance matrix".
>
> The equation is in Matlab format - I've tried solving the problem there
> too, but it didn't work - but eventually I'd like to be able to solve
> problems like that in Python. The problem is that due to its size (1000
> rows and columns), the V matrix becomes singular, thus non-invertible. Any
> suggestions for how to get around this problem? Maybe using a way of
> solving the generalized linear regression problem other than GLS?
>
>
Plain old least squares will probably do a decent job for the fit. Where
you will run into trouble is if you want to estimate the covariance. The
idea of using the variance matrix is to transform the data set into
independent observations of equal variance, but except in extreme cases
that shouldn't really be necessary if you have sufficient data points.
Weighting the data is a simple case of this that merely equalizes the
variance, and it often doesn't make that much difference.

Chuck
```
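The comparison discussed in the thread can be tried with a short NumPy sketch. It is not code from either poster. The synthetic data, the AR(1)-style covariance with `rho = 0.8`, and the helper names `ols_fit` and `gls_fit` are all assumptions made for illustration. The GLS step follows the idea in the reply: a Cholesky factor of V transforms the data into independent observations of equal variance, after which ordinary least squares applies. This avoids forming V^(-1) explicitly, but `np.linalg.cholesky` still raises `LinAlgError` if V is not positive definite, so it does not by itself cure a truly singular V.

```python
import numpy as np


def ols_fit(X, y):
    """Ordinary least squares: minimize ||X b - y||^2."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


def gls_fit(X, y, V):
    """Generalized least squares via whitening (hypothetical helper).

    Factor V = L L^T, then solve the ordinary least squares problem for
    L^{-1} X and L^{-1} y. V must be symmetric positive definite.
    """
    L = np.linalg.cholesky(V)      # lower-triangular factor of V
    Xw = np.linalg.solve(L, X)     # whitened design matrix
    yw = np.linalg.solve(L, y)     # whitened observations
    beta, *_ = np.linalg.lstsq(Xw, yw, rcond=None)
    return beta


# Synthetic example (assumed, not from the thread): a straight line with
# correlated noise whose covariance is V[i, j] = rho ** |i - j|.
rng = np.random.default_rng(0)
n = 200
x = np.linspace(0.0, 10.0, n)
X = np.column_stack([np.ones(n), x])   # columns: intercept, slope
true_beta = np.array([1.0, 2.0])

rho = 0.8
idx = np.arange(n)
V = rho ** np.abs(idx[:, None] - idx[None, :])

noise = np.linalg.cholesky(V) @ rng.standard_normal(n)
y = X @ true_beta + noise

beta_ols = ols_fit(X, y)
beta_gls = gls_fit(X, y, V)

print("OLS estimate:", beta_ols)
print("GLS estimate:", beta_gls)
```

With a diagonal V the same whitening reduces to dividing each row by its standard deviation, which is the weighted case mentioned in the reply.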