[SciPy-User] Revisit Unexpected covariance matrix from scipy.optimize.curve_fit
Fri Feb 22 12:33:07 CST 2013
On Fri, Feb 22, 2013 at 1:27 PM, Tom Aldcroft wrote:
> The 0.11 documentation on curve_fit says:
> sigma : None or N-length sequence
> If not None, it represents the standard-deviation of ydata. This
> vector, if given, will be used as weights in the least-squares
> problem.
> It unambiguously states that sigma is the standard deviation of ydata,
> which is different from a relative weight. That gives a clear
> implication that increasing the standard deviation of all the data
> points by some factor should change the parameter covariance.
> Can the doc string be changed to say "If not None, it represents the
> relative weighting of data points"? I would say that most astronomers
> and physicists are likely to be tripped up by this otherwise because
> "sigma" has such a well-understood meaning.
I agree that this is very misleading and should be changed.
The documentation editor or a pull request can be used to change this.
Josef
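
A minimal sketch of the behaviour being discussed, using the constant-fit
setup from Tom's message. The data, the seed, and the helper name
`fit_error` are made up for illustration and are not taken from his
notebook. With the default settings the reported error does not change when
every sigma is multiplied by 10. SciPy releases later than the 0.11.0 used
in this thread accept an `absolute_sigma` keyword, and the sketch assumes
such a release is installed.

```python
import numpy as np
from scipy.optimize import curve_fit


def const(x, a):
    # Constant model: the same value a at every x.
    return a * np.ones_like(x)


rng = np.random.default_rng(0)  # illustrative seed
n = 100
x = np.arange(n, dtype=float)
y = rng.normal(loc=5.0, scale=1.0, size=n)


def fit_error(sigma_value, **kwargs):
    # Hypothetical helper: fit the constant with a uniform sigma and
    # return the one-sigma error that curve_fit reports for it.
    sigma = np.full(n, sigma_value)
    popt, pcov = curve_fit(const, x, y, p0=[1.0], sigma=sigma, **kwargs)
    return np.sqrt(pcov[0, 0])


# Default behaviour: sigma acts as a relative weight only.
perr_1 = fit_error(1.0)
perr_10 = fit_error(10.0)

# absolute_sigma=True (newer SciPy only): sigma is treated as the
# standard deviation of ydata, so the error scales with it.
perr_abs_1 = fit_error(1.0, absolute_sigma=True)
perr_abs_10 = fit_error(10.0, absolute_sigma=True)

print("relative weights:", perr_1, perr_10)
print("absolute sigma:  ", perr_abs_1, perr_abs_10)
```

In the default case both numbers should agree with the sample standard
deviation of `y` divided by sqrt(n). In the `absolute_sigma=True` case they
should agree with sigma / sqrt(n), which is the textbook error on the mean.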
> On Fri, Feb 22, 2013 at 1:03 PM, Pierre Barbier de Reuille wrote:
>> I don't know about this result I must say, do you have a reference?
>> But intuitively, perr shouldn't change when applying the same weight to all
>> the values.
>>
>> Barbier de Reuille Pierre
>>
>>> >
>>> > In Aug 2011 there was a thread [Unexpected covariance matrix from
>>> > scipy.optimize.curve_fit](http://mail.scipy.org/pipermail/scipy-user/2011-August/030412.html)
>>> > where Christoph Deil reported that "scipy.optimize.curve_fit returns
>>> > parameter errors that don't scale with sigma, the standard deviation
>>> > of ydata, as I expected." Today I independently came to the same
>>> > conclusion.
>>> >
>>> > This thread generated some discussion but seemingly no agreement that
>>> > the covariance output of `curve_fit` is not what would be expected. I
>>> > think the discussion wasn't as focused as possible because the example
>>> > was too complicated. With that in mind, I provide here about the simplest
>>> > possible example, which is fitting a constant to a constant dataset,
>>> > aka computing the mean and error on the mean. Since we know the
>>> > answers we can compare the output of `curve_fit`.
>>> >
>>> > To illustrate things more easily I put the examples into an IPython
>>> > notebook which is available at:
>>> >
>>> > http://nbviewer.ipython.org/5014170/
>>> >
>>> > This was run using scipy 0.11.0 by the way. Any further discussion on
>>> > this topic to come to an understanding of the covariance output from
>>> > `curve_fit` would be appreciated.
>>> >
>>> > Thanks,
>>> > Tom
>>> chi2 = np.sum(((yn-const(x, *popt))/sigma)**2)
>>> perr = np.sqrt(np.diag(pcov)/(chi2/(x.shape[0]-1)))
>>>
>>> Perr is then the actual error in the fit parameter. No?
>>>
>>> -Eric
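
Eric's two lines can be checked on the constant-fit case with a short,
self-contained sketch. The data, the seed, and the value of `sigma_value`
are invented for illustration. It assumes the default `curve_fit`
behaviour, in which the returned `pcov` has already been multiplied by the
reduced chi-square, so dividing that factor back out should recover the
error implied by sigma itself.

```python
import numpy as np
from scipy.optimize import curve_fit


def const(x, a):
    # Constant model: the same value a at every x.
    return a * np.ones_like(x)


rng = np.random.default_rng(1)  # illustrative seed
n = 50
sigma_value = 3.0               # illustrative per-point standard deviation
x = np.arange(n, dtype=float)
sigma = np.full(n, sigma_value)
yn = rng.normal(loc=2.0, scale=sigma_value, size=n)

# Default call: pcov comes back scaled by the reduced chi-square.
popt, pcov = curve_fit(const, x, yn, p0=[1.0], sigma=sigma)

# Undo that scaling, as in the snippet quoted above.
chi2 = np.sum(((yn - const(x, *popt)) / sigma) ** 2)
dof = x.shape[0] - len(popt)    # equals x.shape[0] - 1 for this model
perr = np.sqrt(np.diag(pcov) / (chi2 / dof))

# Textbook error on the mean for n points with known sigma.
expected = sigma_value / np.sqrt(n)
print(perr, expected)
```

For this one-parameter model `dof` is the same as the `x.shape[0]-1` in the
quoted snippet. For a model with more parameters the divisor would be the
number of points minus the number of fitted parameters.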
