[SciPy-User] Unexpected covariance matrix from scipy.optimize.curve_fit

Matt Newville newville@cars.uchicago....
Fri Sep 2 01:43:13 CDT 2011

Hi Cristoph,

On Thu, Sep 1, 2011 at 4:29 PM, Christoph Deil
<deil.christoph@googlemail.com> wrote:
> On Sep 1, 2011, at 9:45 PM, Matt Newville wrote:
>> Christoph,
>> I ran some tests with the scaling of the covariance matrix from
>> scipy.optimize.leastsq.  Using lmfit-py, and scipy.optimize.leastsq, I
>> did fits with sample function of a Gaussian + Line + Simulated Noise
>> (random.normal(scale=0.023).  The "data" had 201 points, and the fit
>> had 5 variables.  I ran fits with covariance scaling on and off, and
>> "data sigma" = 0.1, 0.2, and 0.5.  The full results are below, and the
>> code to run this is
>>   https://github.com/newville/lmfit-py/blob/master/tests/test_covar.py
>> You'll need the latest git version of lmfit-py for this to be able to
>> turn on/off the covariance scaling.
>> As you can see from the results below, by scaling the covariance, the
>> estimated uncertainties in the parameters are independent of sigma,
>> the estimated uncertainty in the data.  In many cases this is a
>> fabulously convenient feature.
>> As expected, when the covariance matrix is not scaled by sum_sqr /
>> nfree, the estimated uncertainties in the variables depends greatly on
>> the value of sigma.  For the "correct" sigma of 0.23 in this one test
>> case, the scaled and unscaled values are very close:
>>   amp_g = 20.99201 +/- 0.05953 (unscaled) vs +/- 0.05423 (scaled) [True=21.0]
>>   cen_g =  8.09857 +/- 0.00435 (unscaled) vs +/- 0.00396 (scaled) [True= 8.1]
>> and so on.  The scaled uncertainties appear to be about 10% larger
>> than the unscaled ones.  Since this was just one set of data (ie, one
>> set of simulated noise), I'm not sure whether this difference is
>> significant or important to you.  Interestingly, these two cases are
>> in better agreement than comparing sigma=0.20 and sigma=0.23 for the
>> unscaled covariance matrix.
>> For myself, I much prefer having estimated uncertainties that may be
>> off by 10% than being expected to know the uncertainties in the data
>> to 10%.  But then, I work in a field where we have lots of data and
>> systematic errors in collection and processing swamp any simple
>> estimate of the noise in the data.
>> As a result, I think the scaling should stay the way it is.
> I think there are use cases for scaling and for not scaling.
> Adding an option to scipy.optimize.curve_fit, as you did for lmfit is a nice solution.
> Returning the scaling factor s = chi2 / ndf (or chi2 and ndf independently) would be another option to let the user decide what she wants.
> The numbers you give in your example are small because your chi2 / ndf is approximately one, so your scaling factor is approximately one.

Ah, right.  I see your point.  Scaling the covariance matrix is
equivalent to asserting that the fit is good (reduced chi_square = 1),
and so scaling sigma such that this is the case, and getting the
parameter uncertainties accordingly.  This, and the fact that reduced
chi_square was slightly less than 1 (0.83) in the example I gave,
explains the ~10% difference in uncertainties.  But again, that's an
idealized case.

> If the model doesn't represent the data well, then chi2 / ndf is larger than one and the differences in estimated parameter errors become larger.

I think the  question is: should the estimated uncertainties reflect
the imperfection of the model?  I can see merit in both methods (which
differ by a factor of sqrt(reduced_chi_square)):

   a) Given an estimate of sigma, estimate the uncertainties.  Use
unscaled covariance.
   b) Assert that this is a "good fit",  estimate the uncertainties.
Use scaled covariance.

That is, one might have a partial estimate of sigma, and use reduced
chi_square maritally to assess how good this is.

> IMO if the user does give sigmas to curve_fit, it means that she has reason to believe that these are the errors on the data points
> and thus the default should be to not apply the scaling factor in that case.
> On the other hand at the moment the scaling factor is always applied, so having a keyword option
> scale_covariance=True as default means backwards compatibility.

I believe you're proposing that default behavior should be "if sigma
is given, use method a, otherwise use method b".  I think that's
reasonable, but don't have a strong opinion.  I'm not sure I see a
case for changing curve_fit, but I'm not committed to the behavior of
lmfit-py.  Perhaps the best thing to do would be to leave covariance
matrix unscaled, but scale the estimated uncertainties as you propose.


--Matt Newville <newville at cars.uchicago.edu> 630-252-0431

More information about the SciPy-User mailing list