[SciPy-User] Generalized least square on large dataset
Wed Mar 7 21:58:10 CST 2012
On Wed, Mar 7, 2012 at 10:46 PM, Charles R Harris
> On Wed, Mar 7, 2012 at 7:39 PM, Peter Cimermančič
> <firstname.lastname@example.org> wrote:
>> I'd like to linearly fit the data that were NOT sampled independently. I
>> came across generalized least square method:
>> X and Y are coordinates of the data points, and V is a "variance matrix".
>> The equation is in Matlab format - I've tried solving the problem there
>> too, but it didn't work - but eventually I'd like to be able to solve
>> problems like that in Python. The problem is that due to its size (1000
>> rows and columns), the V matrix becomes singular, thus un-invertible. Any
>> suggestions for how
>> to get around this problem? Maybe using a way of solving generalized linear
>> regression problem other than GLS?
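One way to sidestep a singular V (a sketch added for illustration, not part of the original thread) is to use the Moore-Penrose pseudo-inverse in the GLS normal equations instead of a plain inverse; the helper name `gls_fit` is made up here:

```python
import numpy as np

def gls_fit(X, y, V):
    """GLS estimate b = (X' V^-1 X)^-1 X' V^-1 y, using the
    pseudo-inverse so a rank-deficient V does not raise an error."""
    Vinv = np.linalg.pinv(V)                 # pseudo-inverse tolerates singular V
    A = X.T @ Vinv @ X                       # "normal equations" matrix
    b = np.linalg.solve(A, X.T @ Vinv @ y)
    return b

# quick check: with V = I this reduces to ordinary least squares
X = np.column_stack([np.ones(5), np.arange(5.0)])
y = 2.0 + 3.0 * np.arange(5.0)
print(gls_fit(X, y, np.eye(5)))
```

Note that `pinv` silently projects out the null space of V, which changes the statistical interpretation of the fit; it is a numerical workaround, not a free lunch.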
> Plain old least squares will probably do a decent job for the fit, where you
> will run into trouble is if you want to estimate the covariance.
Are heteroscedasticity- and (auto)correlation-robust standard errors,
the so-called sandwich estimators of the covariance matrix, popular in
any field outside of economics/econometrics?
(estimate with OLS ignoring non-independent and non-identical noise,
but correct the covariance matrix)
I recently expanded this in statsmodels, and would like to start doing
some advertising in favor of sandwiches soon.
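The idea behind the simplest sandwich (White's HC0 estimator) can be sketched in plain NumPy; this is an illustrative helper, not the statsmodels implementation:

```python
import numpy as np

def ols_with_hc0(X, y):
    """OLS point estimate plus White's HC0 'sandwich' covariance
    (X'X)^-1 X' diag(e_i^2) X (X'X)^-1, which is consistent under
    heteroscedastic errors even though the point estimate ignores them."""
    bread = np.linalg.inv(X.T @ X)
    b = bread @ X.T @ y                   # ordinary least squares
    e = y - X @ b                         # residuals
    meat = X.T @ (e[:, None] ** 2 * X)    # X' diag(e_i^2) X
    cov = bread @ meat @ bread            # bread * meat * bread
    return b, cov
```

In statsmodels itself the same thing is requested through the fit call (e.g. a `cov_type` argument on the model's `fit` method), so there is no need to hand-roll this in practice.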
> The idea of
> using the variance matrix is to transform the data set into independent
> observations of equal variance, but except in extreme cases that shouldn't
> really be necessary if you have sufficient data points. Weighting the data
> is a simple case of this that merely equalizes the variance, and it often
> doesn't make that much difference.
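The whitening transform described above can be sketched directly: factor V with a Cholesky decomposition and run OLS on the transformed data (this requires V to be positive definite, which is exactly what fails for the poster's singular V; the helper name is made up for illustration):

```python
import numpy as np

def whiten_then_ols(X, y, V):
    """GLS via whitening: with V = L L' (Cholesky), OLS on
    L^-1 X and L^-1 y minimizes (y - Xb)' V^-1 (y - Xb)."""
    L = np.linalg.cholesky(V)             # fails if V is not positive definite
    Xw = np.linalg.solve(L, X)            # L^{-1} X: whitened design
    yw = np.linalg.solve(L, y)            # L^{-1} y: whitened response
    b, *_ = np.linalg.lstsq(Xw, yw, rcond=None)
    return b
```

For a diagonal V this reduces to weighted least squares, the "simple case" of equalizing the variances mentioned above.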