[SciPy-user] scipy.optimize.leastsq and covariance matrix meaning

Robert Kern robert.kern@gmail....
Tue Nov 11 14:27:11 CST 2008


On Tue, Nov 11, 2008 at 06:15, massimo sandal <massimo.sandal@unibo.it> wrote:
> Robert Kern wrote:
>>
>> On Mon, Nov 10, 2008 at 05:13, massimo sandal <massimo.sandal@unibo.it>
>> wrote:
>>>
>>> massimo sandal wrote:
>>>
>>>> I'll try to sketch up a script reproducing the core of the problem with
>>>> actual data.
>>>
>>> Here it is. Can anyone give it a look to help me understand if and how to
>>> make sense of the covariance matrix?
>>
>> The covariance matrix does need some scaling before it can be
>> interpreted statistically. Basically, if you are doing nonlinear least
>> squares as a statistical procedure, rather than a purely numerical
>> one, the residuals need to be scaled so that they are in units of
>> standard deviations of the measurement error for each individual
>> measurement. If you don't know what that is, then you can estimate it
>> from the fitted residuals. The parameter estimate is unchanged, but
>> you will need to rescale the covariance matrix of the estimate by
>> multiplying it by the residual variance.
>>
>> scipy.odr does most of this for you. Attached is a version of your
>> code using scipy.odr. Here is the text output:
>>
>> Fitted parameters: [  4.90666526e+06   4.78090340e+09]
>> Covariance: [[  1.72438988e+31  -1.64258997e+35]
>>  [ -1.64258997e+35   1.57791262e+39]]
>> Residual variance: 2.83606592894e-22
>> Scaled error bars: [  6.99319913e+04   6.68959208e+08]
>> Scaled covariance: [[  4.89048340e+09  -4.65849344e+13]
>>  [ -4.65849344e+13   4.47506422e+17]]
>
> Thanks a lot! What I need are the scaled error bars, isn't it?

You have a high anti-correlation (-0.996), so it is very much worth
reporting the entire (scaled) covariance matrix. At least report the
scaled error bars and the correlation factor.

> (By the way: any good tutorial reference/book on this kind of numerical
> things? I am a molecular biologist now doing biophysics, and while enjoying
> it a lot, I feel behind on a lot of technical stuff)

Do you mean the statistical aspects of curve fitting? The book I
learned this kind of stuff from is quite old, _Data Reduction and
Error Analysis for the Physical Sciences_ by Bevington and Robinson,
but it covers the practical basics of curve fitting and error analysis
pretty well. Many statistics books cover similar ground, but speak a
different language.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco


More information about the SciPy-user mailing list