[SciPy-User] Generalized least square on large dataset

josef.pktd@gmai... josef.pktd@gmai...
Wed Mar 7 21:58:10 CST 2012

On Wed, Mar 7, 2012 at 10:46 PM, Charles R Harris
<charlesr.harris@gmail.com> wrote:
> On Wed, Mar 7, 2012 at 7:39 PM, Peter Cimermančič
> <peter.cimermancic@gmail.com> wrote:
>> Hi,
>> I'd like to fit a linear model to data that were NOT sampled independently. I
>> came across the generalized least squares (GLS) method:
>> b=(X'*V^(-1)*X)^(-1)*X'*V^(-1)*Y
>> X and Y are coordinates of the data points, and V is a "variance matrix".
>> The equation is in Matlab format. I've tried solving the problem there too, but
>> it didn't work. Eventually I'd like to be able to solve problems like
>> that in Python. The problem is that, due to its size (1000 rows and columns),
>> the V matrix becomes singular and thus non-invertible. Any suggestions for how
>> to get around this problem? Maybe a way of solving the generalized linear
>> regression problem other than GLS?
> Plain old least squares will probably do a decent job for the fit; where you
> will run into trouble is if you want to estimate the covariance.

Side question:
Are heteroscedasticity- and (auto)correlation-robust standard errors,
the so-called sandwich estimators of the covariance matrix, popular in
any field outside of economics/econometrics?
(Estimate with OLS, ignoring the non-independent and non-identically
distributed noise, but correct the covariance matrix.)

I recently expanded this in statsmodels, and would like to start
advertising in favor of sandwiches soon.
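
To make the sandwich idea concrete, here is a minimal sketch in plain
NumPy (not the statsmodels implementation). It computes the simplest
variant, White's heteroscedasticity-robust (HC0) covariance; the
autocorrelation-robust (HAC) versions need an additional kernel-weighted
term that is not shown. The function name `ols_sandwich` and the
simulated data are made up for illustration.

```python
import numpy as np

def ols_sandwich(X, y):
    """OLS coefficients plus the HC0 (White) sandwich covariance estimate."""
    XtX_inv = np.linalg.inv(X.T @ X)          # the "bread": (X'X)^(-1)
    beta = XtX_inv @ X.T @ y                  # ordinary least squares fit
    resid = y - X @ beta
    # the "meat": X' diag(resid^2) X, built without forming the n x n diagonal
    meat = (X * resid[:, None] ** 2).T @ X
    cov = XtX_inv @ meat @ XtX_inv            # bread * meat * bread
    return beta, cov

# Illustrative data (an assumption): noise standard deviation grows with x.
rng = np.random.default_rng(0)
n = 1000
x = rng.uniform(0, 10, n)
X = np.column_stack([np.ones(n), x])
y = 1.0 + 2.0 * x + rng.normal(scale=0.5 + 0.3 * x)

beta, cov = ols_sandwich(X, y)
robust_se = np.sqrt(np.diag(cov))             # robust standard errors
```

The point estimate is unchanged from OLS; only the covariance of the
estimate is corrected.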


> The idea of
> using the variance matrix is to transform the data set into independent
> observations of equal variance, but except in extreme cases that shouldn't
> really be necessary if you have sufficient data points. Weighting the data
> is a simple case of this that merely equalizes the variance, and it often
> doesn't make that much difference.
> Chuck
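
The transformation described above can be sketched as follows, assuming
V is symmetric positive definite so that a Cholesky factor exists. The
data are whitened with the factor and then fit by ordinary least
squares, which avoids forming V^(-1) explicitly. The function name
`gls_whiten` and the AR(1)-style covariance are assumptions for
illustration, not taken from the original problem.

```python
import numpy as np

def gls_whiten(X, y, V):
    """GLS by whitening; assumes V is symmetric positive definite."""
    L = np.linalg.cholesky(V)        # V = L L'
    Xw = np.linalg.solve(L, X)       # L^(-1) X
    yw = np.linalg.solve(L, y)       # L^(-1) y, now with identity covariance
    beta, *_ = np.linalg.lstsq(Xw, yw, rcond=None)
    return beta

# Illustrative correlated noise (an assumption): V[i, j] = rho ** |i - j|
rng = np.random.default_rng(1)
n = 200
x = np.linspace(0, 10, n)
X = np.column_stack([np.ones(n), x])
rho = 0.8
idx = np.arange(n)
V = rho ** np.abs(idx[:, None] - idx[None, :])
y = 1.0 + 2.0 * x + np.linalg.cholesky(V) @ rng.normal(size=n)

beta_gls = gls_whiten(X, y, V)
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)   # plain OLS for comparison
```

If V is numerically singular, `np.linalg.cholesky` raises `LinAlgError`,
so this sketch does not by itself solve the original problem; it only
shows the whitening step for a well-conditioned V.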