[SciPy-User] Unexpected covariance matrix from scipy.optimize.curve_fit
Charles R Harris
charlesr.harris@gmail....
Thu Sep 1 07:18:12 CDT 2011
On Thu, Sep 1, 2011 at 4:04 AM, Christoph Deil <deil.christoph@googlemail.com> wrote:
>
> On Sep 1, 2011, at 3:45 AM, josef.pktd@gmail.com wrote:
>
> > On Wed, Aug 31, 2011 at 12:10 PM, <josef.pktd@gmail.com> wrote:
> >> On Wed, Aug 31, 2011 at 11:09 AM, Christoph Deil <deil.christoph@googlemail.com> wrote:
> >>>
> >>> On Aug 30, 2011, at 11:25 PM, josef.pktd@gmail.com wrote:
> >>>
> >>> On Tue, Aug 30, 2011 at 4:15 PM, Christoph Deil <deil.christoph@googlemail.com> wrote:
> >>>
> >>> I noticed that scipy.optimize.curve_fit returns parameter errors that
> >>> don't scale with sigma, the standard deviation of ydata, as I expected.
> >>>
> >>> Here is a code snippet to illustrate my point, which fits a straight
> >>> line to five data points:
> >>>
> >>> import numpy as np
> >>> from scipy.optimize import curve_fit
> >>>
> >>> x = np.arange(5)
> >>> y = np.array([1, -2, 1, -2, 1])
> >>> sigma = np.array([1, 2, 1, 2, 1])
> >>>
> >>> def f(x, a, b):
> >>>     return a + b * x
> >>>
> >>> popt, pcov = curve_fit(f, x, y, p0=(0.42, 0.42), sigma=sigma)
> >>> perr = np.sqrt(pcov.diagonal())
> >>>
> >>> print('*** sigma = {0} ***'.format(sigma))
> >>> print('popt: {0}'.format(popt))
> >>> print('perr: {0}'.format(perr))
> >>>
> >>> I get the following result:
> >>>
> >>> *** sigma = [1 2 1 2 1] ***
> >>> popt: [ 5.71428536e-01 1.19956213e-08]
> >>> perr: [ 0.93867933 0.40391117]
> >>>
> >>> Increasing sigma by a factor of 10,
> >>>
> >>> sigma = 10 * np.array([1, 2, 1, 2, 1])
> >>>
> >>> I get the following result:
> >>>
> >>> *** sigma = [10 20 10 20 10] ***
> >>> popt: [ 5.71428580e-01 -2.27625699e-09]
> >>> perr: [ 0.93895295 0.37079075]
> >>>
> >>> The best-fit values stayed the same, as expected. But the error on the
> >>> slope b decreased by 8% (the error on the offset a didn't change much).
> >>> I would have expected the fit parameter errors to increase with
> >>> increasing errors on the data!?
> >>>
> >>> Is this a bug?
> >>>
> >>> No bug in the formulas. I tested all of them when curve_fit was added.
> >>>
> >>> However, in your example the numerical cov lacks quite a bit of
> >>> precision. Trying your example with different starting values, I get a
> >>> 0.05 difference in your perr (std of parameter estimates).
> >>>
> >>> Trying smaller xtol and ftol doesn't change anything. (?)
> >>> Making ftol very small (1e-15), I get a different wrong result:
> >>> popt: [ 5.71428580e-01 -2.27625699e-09]
> >>> perr: [ 0.92582011 0.59868281]
> >>> What do I have to do to get a correct answer (say, to 5 significant
> >>> digits) from curve_fit for this simple example?
> >>>
> >>> Since it's linear:
> >>>
> >>> import scikits.statsmodels.api as sm
> >>>
> >>> x = np.arange(5.)
> >>> y = np.array([1, -2, 1, -2, 1.])
> >>> sigma = np.array([1, 2, 1, 2, 1.])
> >>>
> >>> res = sm.WLS(y, sm.add_constant(x, prepend=True), weights=1./sigma**2).fit()
> >>> res.params
> >>> array([  5.71428571e-01,   1.11022302e-16])
> >>> res.bse
> >>> array([ 0.98609784,  0.38892223])
> >>>
> >>> res = sm.WLS(y, sm.add_constant(x, prepend=True), weights=1./(sigma*10)**2).fit()
> >>> res.params
> >>> array([  5.71428571e-01,   1.94289029e-16])
> >>> res.bse
> >>> array([ 0.98609784,  0.38892223])
> >>>
> >>> Rescaling sigma changes neither the parameter estimates nor perr.
> >>>
> >>> This is what I don't understand. Why don't the parameter estimate
> >>> errors increase with increasing errors sigma on the data points?
> >>> If I have less precise measurements, the model parameters should be
> >>> less constrained?!
> >>>
> >>> I was using MINUIT before I learned Scipy, and the error definition
> >>> for a chi2 fit given in the MINUIT User Guide
> >>> http://wwwasdoc.web.cern.ch/wwwasdoc/minuit/node7.html
> >>> as well as the example results here
> >>> http://code.google.com/p/pyminuit/wiki/GettingStartedGuide
> >>> don't mention the factor s_sq that is used in curve_fit to scale pcov.
> >>> Is the error definition in the MINUIT manual wrong? Can you point me
> >>> to a web resource that explains why the s_sq factor needs to be
> >>> applied to the covariance matrix?
> >>
> >> It's standard textbook information, but Wikipedia seems to be lacking
> >> a bit in this.
> >>
> >> For the linear case:
> >> http://en.wikipedia.org/wiki/Ordinary_least_squares#Assuming_normality
> >>
> >> cov_params = sigma^2 (X'X)^{-1}
> >>
> >> For the non-linear case with leastsq, X is replaced by the Jacobian;
> >> otherwise everything is the same.
> >>
> >> However, in your MINUIT links I saw only the Hessian mentioned (from
> >> very fast skimming of the pages).
> >>
> >> With maximum likelihood, the inverse Hessian is the complete
> >> covariance matrix; no additional multiplication is necessary.
> >>
> >> Essentially, these are implementation details depending on how the
> >> estimation is calculated, and there are various ways of numerically
> >> approximating the Hessian. That's why this is described for
> >> optimize.leastsq (incorrectly, as Chuck pointed out) but not in
> >> optimize.curve_fit.
> >>
> >> With least squares or maximum likelihood, rescaling both y and
> >> f(x, params) has no effect on the parameter estimates; it's just like
> >> changing the units of y, meters instead of centimeters.
> >>
> >> I guess scipy.odr would work differently, since it splits up the
> >> errors between y and the x's, but I never looked at the details.
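For reference, the cov_params = sigma^2 (X'X)^{-1} formula quoted above can be checked directly against the statsmodels numbers with a few lines of numpy. This is only a sketch; it folds the 1/sigma**2 weights into X and y, which is the standard WLS trick:

```python
import numpy as np

x = np.arange(5.)
y = np.array([1, -2, 1, -2, 1.])
sigma = np.array([1, 2, 1, 2, 1.])

# Fold the weights into the design: dividing each row of X and y by
# its sigma is equivalent to WLS with weights 1/sigma**2.
X = np.column_stack([np.ones_like(x), x])
Xw = X / sigma[:, None]
yw = y / sigma

# Normal equations: params = (X'X)^{-1} X'y with the weighted X
XtX_inv = np.linalg.inv(Xw.T @ Xw)
params = XtX_inv @ Xw.T @ yw

# cov_params = s^2 (X'X)^{-1}, with s^2 = chi^2 / ndf from the residuals
resid = yw - Xw @ params
s_sq = resid @ resid / (len(y) - len(params))
bse = np.sqrt(np.diag(s_sq * XtX_inv))

print(params)  # ~[0.5714, 0.0]
print(bse)     # ~[0.9861, 0.3889], matching res.bse above
```

Multiplying sigma by 10 leaves both params and bse unchanged here, because s_sq shrinks by exactly the factor by which (X'X)^{-1} grows.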
>
> OK, now I understand, thanks for your explanations.
>
> Roughly speaking, scipy.optimize.curve_fit applies a factor
> s_sq = chi^2 / ndf to the covariance matrix to account for a possibly
> incorrect overall scale in the y errors "sigma" of the data.
>
> I had simply not seen this factor applied to chi^2 fits before. E.g. in
> many physics and astronomy papers, parameter errors from chi^2 fit
> results are reported without this factor. Also, the manual of the
> fitting package used by most physicists (MINUIT) as well as the
> statistics textbook I use (Cowan -- Statistical Data Analysis) don't
> mention it.
>
> May I therefore suggest explicitly mentioning this factor in the
> scipy.optimize.curve_fit docstring, to avoid confusion?
>
>
Note that most texts will tell you that \sigma^2 should be estimated
independently of the data, i.e., from the known precision of the
measurements and so on. That is because using the residuals to estimate
\sigma^2 is not all that accurate, especially for small data sets. In
practice, that recommendation is commonly ignored, but if you do use the
residuals you should keep in mind the potential inaccuracy of the estimate.
I think curve_fit should probably return the scaled covariance and the
estimate of \sigma^2 separately so that folks can follow the recommended
practice if they have decent a priori knowledge of the measurement errors.
<snip>
Chuck