# [SciPy-User] Unexpected covariance matrix from scipy.optimize.curve_fit

josef.pktd@gmai... josef.pktd@gmai...
Wed Aug 31 20:45:14 CDT 2011

```On Wed, Aug 31, 2011 at 12:10 PM,  <josef.pktd@gmail.com> wrote:
> On Wed, Aug 31, 2011 at 11:09 AM, Christoph Deil
>>
>> On Aug 30, 2011, at 11:25 PM, josef.pktd@gmail.com wrote:
>>
>> On Tue, Aug 30, 2011 at 4:15 PM, Christoph Deil
>>
>> I noticed that scipy.optimize.curve_fit returns parameter errors that don't
>>
>> scale with sigma, the standard deviation of ydata, as I expected.
>>
>> Here is a code snippet to illustrate my point, which fits a straight line to
>>
>> five data points:
>>
>> import numpy as np
>>
>> from scipy.optimize import curve_fit
>>
>> x = np.arange(5)
>>
>> y = np.array([1, -2, 1, -2, 1])
>>
>> sigma = np.array([1,  2, 1,  2, 1])
>>
>> def f(x, a, b):
>>
>>     return a + b * x
>>
>> popt, pcov = curve_fit(f, x, y, p0=(0.42, 0.42), sigma=sigma)
>>
>> perr = np.sqrt(pcov.diagonal())
>>
>> print('*** sigma = {0} ***'.format(sigma))
>>
>> print('popt: {0}'.format(popt))
>>
>> print('perr: {0}'.format(perr))
>>
>> I get the following result:
>>
>> *** sigma = [1 2 1 2 1] ***
>>
>> popt: [  5.71428536e-01   1.19956213e-08]
>>
>> perr: [ 0.93867933  0.40391117]
>>
>> Increasing sigma by a factor of 10,
>>
>> sigma = 10 * np.array([1,  2, 1,  2, 1])
>>
>> I get the following result:
>>
>> *** sigma = [10 20 10 20 10] ***
>>
>> popt: [  5.71428580e-01  -2.27625699e-09]
>>
>> perr: [ 0.93895295  0.37079075]
>>
>> The best-fit values stayed the same as expected.
>>
>> But the error on the slope b decreased by 8% (the error on the offset a
>>
>> didn't change much)
>>
>> I would have expected fit parameter errors to increase with increasing
>>
>> errors on the data!?
>>
>> Is this a bug?
>>
>> No bug in the formulas. I tested all of them when curve_fit was added.
>>
>> However in your example the numerical cov lacks quite a bit of
>> precision. Trying your example with different starting values, I get a
>> 0.05 difference in your perr (std of parameter estimates).
>>
>> Trying smaller xtol and ftol doesn't change anything. (?)
>>
>> Making ftol = 1e-15 very small I get a different wrong result:
>> popt: [  5.71428580e-01  -2.27625699e-09]
>> perr: [ 0.92582011  0.59868281]
>> What do I have to do to get a correct answer (say to 5 significant digits)
>> from curve_fit for this simple example?
>>
>> Since it's linear
>>
>> import scikits.statsmodels.api as sm
>>
>> x = np.arange(5.)
>>
>> y = np.array([1, -2, 1, -2, 1.])
>>
>> sigma = np.array([1,  2, 1,  2, 1.])
>>
>> res = sm.WLS(y, sm.add_constant(x, prepend=True), weights=1./sigma**2).fit()
>>
>> res.params
>>
>> array([  5.71428571e-01,   1.11022302e-16])
>>
>> res.bse
>>
>> array([ 0.98609784,  0.38892223])
>>
>> res = sm.WLS(y, sm.add_constant(x, prepend=True),
>> weights=1./(sigma*10)**2).fit()
>>
>> res.params
>>
>> array([  5.71428571e-01,   1.94289029e-16])
>>
>> res.bse
>>
>> array([ 0.98609784,  0.38892223])
>>
>> rescaling doesn't change parameter estimates nor perr
>>
>> This is what I don't understand.
>> Why don't the parameter estimate errors increase with increasing errors
>> sigma on the data points?
>> If I have less precise measurements, the model parameters should be less
>> constrained?!
>> I was using MINUIT before I learned Scipy and the error definition for a
>> chi2 fit given in the MINUIT User Guide
>> http://wwwasdoc.web.cern.ch/wwwasdoc/minuit/node7.html
>> as well as the example results here
>> don't mention the factor s_sq that is used in curve_fit to scale pcov.
>> Is the error definition in the MINUIT manual wrong?
>> Can you point me to a web resource that explains why the s_sq factor needs
>> to be applied to the covariance matrix?
>
> It's standard text book information, but Wikipedia seems to be lacking
> a bit in this.
>
> for the linear case
> http://en.wikipedia.org/wiki/Ordinary_least_squares#Assuming_normality
>
> cov_params = sigma^2 (X'X)^{-1}
>
> for the non-linear case with leastsq, X is replaced by Jacobian,
> otherwise everything is the same.
>
> However, in your minuit links I saw only the Hessian mentioned (from
> very fast skimming the pages)
>
> With maximum likelihood, the inverse Hessian is the complete
> covariance matrix, no additional multiplication is necessary.
>
> Essentially, these are implementation details depending on how the
> estimation is calculated, and there are various ways of numerically
> approximating the Hessian.
> That's why this is described for optimize.leastsq (incorrectly as
> Chuck pointed out) and but not in optimize.curve_fit.
>
> With leastsquares are maximum likelihood, rescaling both y and
> f(x,params) has no effect on the parameter estimates, it's just like
> changing units of y, meters instead of centimeters.
>
> I guess scipy.odr would work differently, since it is splitting up the
> errors between y and x's, but I never looked at the details.
>
>
>>
>> Josef
>>
>>
>>
>> PS: I've attached a script to fit the two examples using statsmodels, scipy
>> and minuit (applying the s_sq factor myself).
>> Here are the results I get (who's right for the first example? why does
>> statsmodels only return on parameter value and error?):
>>     """Example from
>>     x = np.array([1  , 2  , 3  , 4  ])
>>     y = np.array([1.1, 2.1, 2.4, 4.3])
>>     sigma = np.array([0.1, 0.1, 0.2, 0.1])
>> statsmodels.api.WLS
>> popt: [ 1.04516129]
>> perr: [ 0.0467711]
>> scipy.optimize.curve_fit
>> popt: [  8.53964011e-08   1.04516128e+00]
>> perr: [ 0.27452122  0.09784324]
>
> that's what I get with example 1 when I run your script,
> I don't know why you have one params in your case
> (full_output threw an exception in curve_fit with scipy.__version__ '0.9.0'
>
> statsmodels.api.WLS
> popt: [ -6.66133815e-16   1.04516129e+00]
> perr: [ 0.33828314  0.12647671]
> scipy.optimize.curve_fit
> popt: [  8.53964011e-08   1.04516128e+00]
> perr: [ 0.27452122  0.09784324]
>
>
>> minuit
>> popt: [-4.851674617611934e-14, 1.0451612903225629]
>> perr: [ 0.33828315  0.12647671]

statsmodels.api.WLS
popt: [ -4.90926744e-16   1.04516129e+00]
perr: [ 0.33828314  0.12647671]
statsmodels NonlinearLS
popt: [ -3.92166386e-08   1.04516130e+00]
perr: [ 0.33828314  0.12647671]

finally, I got some bugs out of the weights handling, but still not fully tested

def run_nonlinearls():
from scikits.statsmodels.miscmodels.nonlinls import NonlinearLS

class Myfunc(NonlinearLS):

def _predict(self, params):
x = self.exog
a, b = params
return a + b*x

mod = Myfunc(y, x, sigma=sigma**2)
res = mod.fit(start_value=(0.042, 0.42))
print ('statsmodels NonlinearLS')
print('popt: {0}'.format(res.params))
print('perr: {0}'.format(res.bse))

The basics is the same as curve_fit using leastsq, but it uses complex
derivatives which are usually numerically very good.

So it looks like the problems with curve_fit in your example are only
in the numerically derivatives that leastsq is using for the Jacobian.

If leastsq is using only forward differences, then it might be better
to calculate the final Jacobian with centered differences. just a
guess.

>
> statsmodels and minuit agree pretty well
>
>>     """Example from
>> http://mail.scipy.org/pipermail/scipy-user/2011-August/030412.html"""
>>     x = np.arange(5)
>>     y = np.array([1, -2, 1, -2, 1])
>>     sigma = 10 * np.array([1,  2, 1,  2, 1])
>> statsmodels.api.WLS
>> popt: [  5.71428571e-01   7.63278329e-17]
>> perr: [ 0.98609784  0.38892223]
>> scipy.optimize.curve_fit
>> popt: [  5.71428662e-01  -8.73679511e-08]
>> perr: [ 0.97804034  0.3818681 ]
>> minuit
>> popt: [0.5714285714294132, 2.1449508835758024e-13]
>> perr: [ 0.98609784  0.38892223]

statsmodels.api.WLS
popt: [  5.71428571e-01   1.94289029e-16]
perr: [ 0.98609784  0.38892223]
statsmodels NonlinearLS
popt: [  5.71428387e-01   8.45750929e-08]
perr: [ 0.98609784  0.38892223]

Josef

>
> statsmodels and minuit agree,
>
> my guess is that the jacobian calculation of leastsq (curve_fit) is
> not very good in these examples. Maybe trying Dfun or the other
> options, epsfcn, will help.
>
> I was trying to see whether I get better results calculation the
> numerical derivatives in a different way, but had to spend the time
> fixing bugs.
> (NonlinearLS didn't work correctly with weights.)
>
> Josef
>
>>
>>
>>
>>
>> Looking at the source code I see that scipy.optimize.curve_fit multiplies
>>
>> the pcov obtained from scipy.optimize.leastsq by a factor s_sq:
>>
>> https://github.com/scipy/scipy/blob/master/scipy/optimize/minpack.py#L438
>>
>>     if (len(ydata) > len(p0)) and pcov is not None:
>>
>>         s_sq = (func(popt, *args)**2).sum()/(len(ydata)-len(p0))
>>
>>         pcov = pcov * s_sq
>>
>> If so is it possible to add an explanation to
>>
>> http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.curve_fit.html
>>
>> that pcov is multiplied with this s_sq factor and why that will give correct
>>
>> errors?
>>
>> After I noticed this issue I saw that this s_sq factor is mentioned in the
>>
>> cov_x return parameter description of leastsq,
>>
>> but I think it should be explained in curve_fit where it is applied, maybe
>>
>> leaving a reference in the cov_x leastsq description.
>>
>> Also it would be nice to mention the full_output option in the curve_fit
>>
>> docu, I only realized after looking at the source code that this was
>>
>> possible.
>>
>> Christoph
>>
>> _______________________________________________
>>
>> SciPy-User mailing list
>>
>> SciPy-User@scipy.org
>>
>> http://mail.scipy.org/mailman/listinfo/scipy-user
>>
>>
>> _______________________________________________
>> SciPy-User mailing list
>> SciPy-User@scipy.org
>> http://mail.scipy.org/mailman/listinfo/scipy-user
>>
>>
>>
>>
>> _______________________________________________
>> SciPy-User mailing list
>> SciPy-User@scipy.org
>> http://mail.scipy.org/mailman/listinfo/scipy-user
>>
>>
>
```