[SciPy-user] incorrect variance in optimize.curvefit and leastsq
josef.pktd@gmai...
josef.pktd@gmai...
Fri Feb 13 10:21:41 CST 2009
> I think for the weighted least squares problem the weights should go
> into the SSE calculation. I didn't find directly the reference, but I
> am somewhat confident that this is correct, from the analogy to the
> transformed
> model ynew = y*weight where weight_i = 1/sigma_i in the linear case.
> But it's too late today to try to check this.
>
> SSE = np.sum((err*weight)**2)
>
I looked at the weighted function some more:
Since the error calculation for the s_sq uses `func =
_weighted_general_function`
the above weighting for the SSE is automatically done. But then, in
this case there is no renormalization
necessary when calculation s_sq. The special casing of the weighted
function should be dropped (the commented out part below)
if (len(ydata) > len(p0)) and pcov is not None:
s_sq = (func(popt, *args)**2).sum()/(len(ydata)-len(p0))
## if sigma is not None:
## s_sq /= (args[-1]**2).sum()
pcov = pcov * s_sq
else:
pcov = inf
Below is the test for the weighted case, rewritten from the
unweighted NIST example.
Note: I multiplied y and and f(x) by the weight, which then is
corrected again by the weighted least squares, and so I get the
original NIST estimates back. This test passes with the above
correction but fails in the current version, r5551.
Josef
-----------------------
def test_curvefit_weights():
'''test against NIST certified case at
http://www.itl.nist.gov/div898/strd/nls/data/misra1b.shtml
add weights to test curvefit with weights'''
data = array([[ 10.07, 77.6 ],
[ 14.73, 114.9 ],
[ 17.94, 141.1 ],
[ 23.93, 190.8 ],
[ 29.61, 239.9 ],
[ 35.18, 289. ],
[ 40.02, 332.8 ],
[ 44.82, 378.4 ],
[ 50.76, 434.8 ],
[ 55.05, 477.3 ],
[ 61.01, 536.8 ],
[ 66.4 , 593.1 ],
[ 75.47, 689.1 ],
[ 81.78, 760. ]])
y,x = data[:,0],data[:,1]
pstd_c = [3.1643950207E+00, 4.2547321834E-06]
popt_c = [3.3799746163E+02, 3.9039091287E-04]
SSE_c = 7.5464681533E-02
Rstd_c = 7.9301471998E-02
decimal = 5 #accuracy of parameter estimate and standard deviation
w = 1.0 + np.arange(len(x)) # 1/weights = sigma
print w
def funct(x, b1, b2):
#weighted function for y
return w * b1 * (1-(1+b2*x/2)**(-2))
y = w * y #weighted y
start1 = [500, 0.0001]
start2 = [300, 0.0002]
for start in [start1, start2]:
popt, pcov = curve_fit(funct, x, y, p0=start, sigma=w)
pstd = np.sqrt(np.diag(pcov))
assert_almost_equal(popt, popt_c, decimal=decimal)
assert_almost_equal(pstd, pstd_c, decimal=decimal)
---------------------------
More information about the SciPy-user
mailing list