[SciPy-user] linear regression
Wed May 27 14:37:14 CDT 2009
On Wed, May 27, 2009 at 14:22, <firstname.lastname@example.org> wrote:
> On Wed, May 27, 2009 at 3:03 PM, Robert Kern <email@example.com> wrote:
>> On Wed, May 27, 2009 at 13:28, <firstname.lastname@example.org> wrote:
>>> On Wed, May 27, 2009 at 12:35 PM, ms <email@example.com> wrote:
>>>> firstname.lastname@example.org wrote:
>>>>>> Have a look here <http://www.scipy.org/Cookbook/LinearRegression>
>>>>> y = Beta0 + Beta1 * x + Beta2 * x**2 is the second order polynomial.
>>>>> I also should have looked, polyfit returns the polynomial coefficients
>>>>> but doesn't calculate the variance-covariance matrix or standard
>>>>> errors of the OLS estimate.
>>>> AFAIK, the ODR fitting routines return all these parameters, so one can
>>>> maybe use that for linear fitting too.
>>> you mean scipy.odr?
>>> I never looked at it in detail. Conceptually it is very similar to
>>> standard regression, but I've never seen an application for it, nor do
>>> I know the probability theoretic or econometric background of it.
>> ODR is nonlinear least-squares with errors in both variables (e.g.
>> minimizing the weighted sum of squared distances from each point to
>> the corresponding closest points on the curve rather than "straight
>> down" as in OLS). scipy.odr implements both ODR and OLS. It also
>> implements implicit regression, where the relationship between
>> variables is not expressed as "y=f(x)" but "f(x,y)=0" such as fitting
>> an ellipse.
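The explicit ("y=f(x)") case described above can be sketched with scipy.odr; the line, noise levels, and starting values here are invented for illustration:

```python
import numpy as np
from scipy import odr

# Invented data: a straight line with noise in both x and y.
rng = np.random.default_rng(42)
x_true = np.linspace(0.0, 10.0, 50)
x = x_true + rng.normal(scale=0.2, size=x_true.size)
y = 1.0 + 2.0 * x_true + rng.normal(scale=0.2, size=x_true.size)

def line(beta, x):
    return beta[0] + beta[1] * x

# RealData lets us state uncertainties on both axes, which is
# what distinguishes ODR from "straight down" OLS.
data = odr.RealData(x, y, sx=0.2, sy=0.2)
out = odr.ODR(data, odr.Model(line), beta0=[0.0, 1.0]).run()
# out.beta     -> parameter estimates
# out.sd_beta  -> standard errors
# out.cov_beta -> covariance matrix of the estimates
```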
>>> Results for many cases will be relatively close to standard least squares.
>>> A Google search shows links to curve fitting but not to any
>>> econometric theory. On the other hand, there is a very large
>>> literature on how to treat measurement errors and endogeneity of
>>> regressors for (standard) least squares and maximum likelihood.
>> The extension is straightforward. ODR is really just a generalization
>> of least-squares. Unfortunately, the links to the relevant papers seem
>> to have died. I've put them up here:
> Thanks for the links, I finally also found out that in Wikipedia it is
> under "Total least squares". Under "Errors-in-variables model" it says:
> Error-in-variables models can be estimated in several different ways.
> Besides those outlined here, see:
> * total least squares for a method of fitting which does not
> arise from a statistical model;
> From a brief reading, I think that the main limitation is that it
> doesn't allow you to explicitly model the joint error structure. It
> looks like this will be done implicitly by the scaling factors and
> other function parameters. But this is just my first impression.
For "y=f(x)" models, this is true. Both y and x can be multivariate,
and you can express the covariance of the uncertainties for each, but
not covariance between the y and x uncertainties. This is because of
the numerical tricks used for efficient implementation. However,
"f(x)=0" models can express covariances between all dimensions of x.
"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
-- Umberto Eco