[SciPy-user] linear regression

josef.pktd@gmai... josef.pktd@gmai...
Wed May 27 14:22:23 CDT 2009

On Wed, May 27, 2009 at 3:03 PM, Robert Kern <robert.kern@gmail.com> wrote:
> On Wed, May 27, 2009 at 13:28,  <josef.pktd@gmail.com> wrote:
>> On Wed, May 27, 2009 at 12:35 PM, ms <devicerandom@gmail.com> wrote:
>>> josef.pktd@gmail.com wrote:
>>>>> Have a look here <http://www.scipy.org/Cookbook/LinearRegression>
>>>> y = Beta0 + Beta1 * x + Beta2 * x**2   is the second order polynomial.
>>>> I also should have looked, polyfit returns the polynomial coefficients
>>>> but doesn't calculate the variance-covariance matrix or standard
>>>> errors of the OLS estimate.
>>> AFAIK, the ODR fitting routines return all these parameters, so one can
>>> maybe use that for linear fitting too.
>> you mean scipy.odr?
>> I never looked at it in detail. Conceptually it is very similar to
>> standard regression, but I've never seen an application for it, nor do
>> I know the probability-theoretic or econometric background of it.
> ODR is nonlinear least-squares with errors in both variables (e.g.
> minimizing the weighted sum of squared distances from each point to
> the corresponding closest points on the curve rather than "straight
> down" as in OLS). scipy.odr implements both ODR and OLS. It also
> implements implicit regression, where the relationship between
> variables is not expressed as "y=f(x)" but "f(x,y)=0" such as fitting
> an ellipse.
>> The
>> results for many cases will be relatively close to standard least
>> squares.
>> A google search shows links to curve fitting but not to any
>> econometric theory. On the other hand, there is a very large
>> literature on how to treat measurement errors and endogeneity of
>> regressors for (standard) least squares and maximum likelihood.
> The extension is straightforward. ODR is really just a generalization
> of least-squares. Unfortunately, the links to the relevant papers seem
> to have died. I've put them up here:
> http://www.mechanicalkern.com/static/odr_vcv.pdf
> http://www.mechanicalkern.com/static/odr_ams.pdf
> http://www.mechanicalkern.com/static/odrpack_guide.pdf
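
On the polyfit point quoted above (coefficients but no variance-covariance
matrix or standard errors), here is a minimal sketch of computing them
directly for the second-order polynomial with plain numpy. The function name
ols_with_cov and the synthetic data are illustrative assumptions, not anything
from scipy or the cookbook page; it uses the textbook estimate
sigma^2 * inv(X'X) and assumes homoscedastic, uncorrelated errors.

```python
import numpy as np

def ols_with_cov(x, y, degree=2):
    """OLS polynomial fit: coefficients, covariance matrix, standard errors.

    Hypothetical helper for illustration; assumes i.i.d. errors in y only.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Design matrix with columns 1, x, x**2, ... so beta[0] is Beta0.
    X = np.vander(x, degree + 1, increasing=True)
    beta, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = len(y) - X.shape[1]          # observations minus parameters
    sigma2 = resid @ resid / dof       # unbiased residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)
    se = np.sqrt(np.diag(cov))
    return beta, cov, se

# Synthetic data (assumed): y = 1 + 2*x - 0.5*x**2 + noise
rng = np.random.default_rng(0)
x = np.linspace(-2.0, 2.0, 50)
y = 1.0 + 2.0 * x - 0.5 * x**2 + rng.normal(scale=0.1, size=x.size)

beta, cov, se = ols_with_cov(x, y)
print("coefficients:", beta)
print("standard errors:", se)
```

The coefficients should agree with np.polyfit(x, y, 2) read in reverse order,
since polyfit returns the highest power first.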

Thanks for the links. I also finally found out that Wikipedia covers it
under "Total Regression". The "Errors-in-Variables model" article says:

Error-in-variables models can be estimated in several different ways.
Besides those outlined here, see:
        * total least squares for a method of fitting which does not
          arise from a statistical model;
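
To make the "total least squares" entry concrete, here is a minimal sketch
for a straight line, using the SVD of the centered data. The function name
tls_line is a hypothetical helper; the sketch assumes equal error variance in
x and y and a line that is not vertical.

```python
import numpy as np

def tls_line(x, y):
    """Fit y = a + b*x by total least squares (orthogonal distances).

    Hypothetical helper for illustration; assumes equal error variance
    in both variables and a non-vertical line.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xm, ym = x.mean(), y.mean()
    # The TLS line passes through the centroid, so center the data first.
    A = np.column_stack([x - xm, y - ym])
    # The right singular vector for the smallest singular value is the
    # normal to the best-fitting line.
    _, _, vt = np.linalg.svd(A, full_matrices=False)
    nx, ny = vt[-1]
    b = -nx / ny
    a = ym - b * xm
    return a, b

# Points lying exactly on y = 1 + 2*x are recovered up to rounding.
a, b = tls_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
print(a, b)
```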

From a brief reading, I think that the main limitation of ODR is that it
doesn't allow you to explicitly model the joint error structure. It
looks like this is done implicitly through the scaling factors and
other function parameters, but this is just my first impression.
In econometrics, by contrast, the most common methods are instrumental
variables and two-stage estimators, both of which try to explicitly
remove the randomness in the regressors (at least the part that is
correlated with the regression error).
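
For comparison, a minimal sketch of fitting the same linear model with
scipy.odr in both ODR and OLS mode, assuming an installed SciPy that
provides scipy.odr. The synthetic data, the noise level of 0.3, and the
function name linear are assumptions made for illustration. The error
structure enters only through the per-point standard deviations sx and sy,
which is the implicit weighting mentioned above.

```python
import numpy as np
from scipy import odr

def linear(beta, x):
    # Model in the form scipy.odr expects: f(beta, x) -> y
    return beta[0] + beta[1] * x

# Synthetic data (assumed): noise in both x and y around y = 1 + 2*x
rng = np.random.default_rng(1)
x_true = np.linspace(0.0, 10.0, 40)
x_obs = x_true + rng.normal(scale=0.3, size=x_true.size)
y_obs = 1.0 + 2.0 * x_true + rng.normal(scale=0.3, size=x_true.size)

model = odr.Model(linear)
# sx and sy are the standard deviations of the measurement errors.
data = odr.RealData(
    x_obs,
    y_obs,
    sx=np.full(x_obs.shape, 0.3),
    sy=np.full(y_obs.shape, 0.3),
)

# fit_type=0: explicit orthogonal distance regression
odr_fit = odr.ODR(data, model, beta0=[0.0, 1.0])
odr_fit.set_job(fit_type=0)
odr_out = odr_fit.run()

# fit_type=2: ordinary least squares with the same model and data
ols_fit = odr.ODR(data, model, beta0=[0.0, 1.0])
ols_fit.set_job(fit_type=2)
ols_out = ols_fit.run()

print("ODR beta:", odr_out.beta, "sd:", odr_out.sd_beta)
print("OLS beta:", ols_out.beta, "sd:", ols_out.sd_beta)
```

Both outputs also carry cov_beta, the estimated covariance matrix of the
parameters.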

I just looked at the published docs for odr and they could use quite a
bit of reorganization (e.g. the docstring of odrpack is missing). Reading
the source files is currently more informative.

