[SciPy-user] incorrect variance in optimize.curvefit and leastsq
Fri Feb 13 00:08:03 CST 2009
On Fri, Feb 13, 2009 at 12:24 AM, Travis E. Oliphant
> firstname.lastname@example.org wrote:
>> On Thu, Feb 12, 2009 at 11:09 PM, Travis E. Oliphant
>> <email@example.com> wrote:
>>> firstname.lastname@example.org wrote:
>>>> I just saw the new optimize.curvefit which provides a wrapper around
>> the standard deviation of the error can be calculated and the
>> corrected (this is written for the use from outside of curvefit):
>> yhat = func(x,popt, popt) # get predicted observations
>> SSE = np.sum((y-yhat)**2)
>> sig2 = SSE/(len(y)-len(popt))
>> ecov = sig2*pcov # this is the variance-covariance matrix of
>> the parameter estimates
>> inside curvefit, this work (before the return):
>> err = func(popt, *args)
>> SSE = np.sum((err)**2)
>> sig2 = SSE / (len(ydata) - len(popt))
>> pcov = sig2 * pcov
I think for the weighted least squares problem the weights should go
into the SSE calculation. I didn't find directly the reference, but I
am somewhat confident that this is correct, from the analogy to the
model ynew = y*weight where weight_i = 1/sigma_i in the linear case.
But it's too late today to try to check this.
SSE = np.sum((err*weight)**2)
> Thanks very much for these... I appreciate your help in getting
> curve_fit in better shape. These functions were essentially added to
> curvefit so now curve_fit can be tested against those NIST pages.
> Thank you also for the unit tests.
> A good project idea for somebody wanting to get their feet wet with
> SciPy would be to write unit-tests of curve_fit against some more of
> those NIST data-sets --- those look nice. A combination of loadtxt and
> additional parsing fu and you could automate the whole thing --- nice
> project for a student out there ;-)
Yes, and they are also useful to test other optimization functions.
Unfortunately, I didn't see any cases with several explanatory
variables or with weights/heteroscedasticity. There are also certified
reference cases for basic statistics and linear regression, that also
would be useful.
I will try to look during weekend at the weighted case a bit more.
More information about the SciPy-user