# [SciPy-User] leastsq - When to scale covariance matrix by reduced chi square for confidence interval estimation

Gregor Thalhammer gregor.thalhammer@gmail....
Fri Jun 1 07:21:15 CDT 2012

On 1.6.2012 at 11:21, Markus Baden wrote:

> Hi Gregor,
>
> Thanks for the fast reply.
>
> If you have knowledge about the statistical errors of your data, then skipping steps 2 and 3 is recommended, and you can use the chi square to assess the validity of the fit and of your assumptions about the errors. On the other hand, if you have insufficient knowledge about the errors, you can use the reduced chi square as an estimate for the variance of your data (at least under the assumption that the error is the same for all data points). This is the idea behind steps 2 and 3.
>
> I just want to get that straight: So basically in the case where I either don't have errors, or I don't trust them, multiplying the covariance by the reduced chi square amounts to "normalizing" the covariance such that the fit would have a chi square of one (?). Maybe your point could go into the docs for curve_fit... or there could be a comment about standard procedure a bit like in origin (http://www.originlab.com/www/support/resultstech.aspx?ID=452&language=English&Version=All)

Yes, I think you correctly got the idea.
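
A minimal sketch of the two options, for illustration only. It assumes a SciPy version newer than this thread (0.14 or later), where curve_fit has the absolute_sigma keyword; the straight-line model, the synthetic data and the error size of 0.5 are made up for the example and are not taken from the data discussed here.

```python
import numpy as np
from scipy.optimize import curve_fit


def line(x, a, b):
    """Hypothetical model used only for this illustration."""
    return a * x + b


# Synthetic data with an assumed, constant 1-sigma error on every point
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 50)
sigma = np.full_like(x, 0.5)
y = line(x, 2.0, 1.0) + rng.normal(0.0, sigma)

# Errors trusted: the covariance is used as is (steps 2 and 3 skipped)
popt, pcov_abs = curve_fit(line, x, y, sigma=sigma, absolute_sigma=True)

# Errors unknown or not trusted: curve_fit scales the covariance by the
# reduced chi square (steps 2 and 3 applied)
_, pcov_rel = curve_fit(line, x, y, sigma=sigma, absolute_sigma=False)

# Reduced chi square computed by hand from the weighted residuals
dof = x.size - popt.size
chi2_red = np.sum(((y - line(x, *popt)) / sigma) ** 2) / dof

# 1-sigma parameter errors for both conventions
perr_abs = np.sqrt(np.diag(pcov_abs))
perr_rel = np.sqrt(np.diag(pcov_rel))

print("reduced chi square:", chi2_red)
print("errors, sigma trusted:    ", perr_abs)
print("errors, covariance scaled:", perr_rel)
```

The two covariance matrices differ exactly by the factor chi2_red, so the quoted parameter errors differ by its square root. For a reduced chi square close to one, both conventions give nearly the same numbers.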

>
>
> > Now in the particular problem I am working on, we have a couple of fits like [5], and some of them have a slightly worse reduced chi square of, say, about 1.4 or 0.7. At this point the two methods start to deviate, and I am wondering which would be the correct way of quoting the errors estimated from the fit. Even a basic reference to some textbook that explains the method used in scipy would be very helpful.
>
> I didn't look at your data, but I guess that these values of the reduced chi square are still in range such that they are not a significant deviation from the expected value of one. The chi-squared distribution is rather broad. So I would omit steps 2 and 3. Only if you have good reasons not to trust your assumptions about the errors of the data, then apply steps 2 and 3.
>
> We looked at which part of the CDF these values are and they are still ok. And our errors are all inferred from measurements, so we trust them quite a bit. We use the fitting described to obtain a particular property of an ion via spectroscopy... that's also why we want to get our errors on that property correct :)
>
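
The CDF check mentioned above could look roughly like the following sketch, using scipy.stats.chi2. The number of degrees of freedom (20) is a hypothetical value chosen for illustration, since the actual number of data points and parameters is not given in this thread.

```python
from scipy.stats import chi2


def chi2_tail_probabilities(chi2_red, dof):
    """Return (P(X <= chi2), P(X >= chi2)) for a chi-squared variable X
    with `dof` degrees of freedom, where chi2 = chi2_red * dof."""
    chi2_value = chi2_red * dof
    return chi2.cdf(chi2_value, dof), chi2.sf(chi2_value, dof)


dof = 20  # hypothetical: number of data points minus number of fit parameters
for chi2_red in (0.7, 1.4):
    lower, upper = chi2_tail_probabilities(chi2_red, dof)
    print(f"reduced chi square {chi2_red}: "
          f"P(lower) = {lower:.3f}, P(higher) = {upper:.3f}")
```

If neither tail probability is very small, the observed reduced chi square is compatible with the assumed errors. The width of the distribution shrinks as the number of degrees of freedom grows, so the same reduced chi square can be unremarkable for a small data set and significant for a large one.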

As described nicely in, e.g., http://mljohnson.pharm.virginia.edu/pdfs/174.pdf, you have to be careful about the parameter error estimates obtained by this procedure. In general, the obtained results are too optimistic.

Gregor