[Numpy-discussion] polyfit in NumPy v1.7

David Pine djpine@gmail....
Wed Feb 27 14:01:41 CST 2013

Pauli, Josef, Chuck,

I read over the discussion on curve_fit.  I believe I now understand what
people are trying to do when they write about scaling the weighting and/or
covariance matrix.  And I agree that what polyfit does in its current form
is estimate the absolute errors in the data from the data and the fit.

Unfortunately, in the case that you want to supply the absolute uncertainty
estimates, polyfit doesn't leave the option of not making this
"correction".  This is a mistake,

In my opinion, polyfit and curve_fit should be organized with the following

ydata_err  :  array_like or float or None

absolute_err  :  bool (True or False)


If ydata_err is an array, then it has the same length as xdata and ydata
and gives the uncertainty (absolute or relative) of ydata.

If ydata_err is a float (a single value), it gives the uncertainty
(absolute or relative) of ydata.  That is, all ydata points have the same

If ydata_err is None, the ydata_err is set equal to 1 internally.

If absolute_err is True, then no correction is made to the covariance matrix

If absolute_err is False, the a correction is made to the covariance matrix
based on the square of residuals and the value(s) of ydata_err so that it
gives useful estimates of the uncertainties in the fitting parameters.

Finally, I have a quibble with the use of the extra factor of 2 subtracted
in the denominator of  fac (line 596).  This is highly non-standard.  Most
of the literature does not use this factor, and for good reason.  It leads
to complete nonsense when there are only 3 or 4 data points, for one thing.
 In supplying software for general use, the most common standard should be

Finally, the documentation needs to be improved so that it is clear.  I
would be happy to contribute both to improving the documentation and



On Wed, Feb 27, 2013 at 12:47 PM, Pauli Virtanen <pav@iki.fi> wrote:

> 27.02.2013 16:40, David Pine kirjoitti:
> [clip]
> > 2.  I am sorry but I don't understand your response.  The matrix Vbase
> > in the code is already the covariance matrix, _before_ it is scaled by
> > fac.  Scaling it by fac and returning  Vbase*fac as the covariance
> > matrix is wrong, at least according to the references I know, including
> > "Numerical Recipes", by Press et al, "Data Reduction and Error Analysis
> > for the Physical Sciences" by Bevington, both standard works.
> The covariance matrix is (A^T A)^-1 only if the data is weighed by its
> standard errors prior to lstsq. Polyfit estimates the standard errors
> from the fit itself, which results in the `fac` multiplication.
> This is apparently what some people expect. The way the weight
> parameters work is however confusing, as they are
> w[i]=sigma_estimate/sigma[i], rather than being absolute errors.
> Anyway, as Josef noted, it's the same problem that curve_fit in Scipy
> had and probably the same fix needs to be done here.
> --
> Pauli Virtanen
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.scipy.org/pipermail/numpy-discussion/attachments/20130227/1944d1b7/attachment.html 

More information about the NumPy-Discussion mailing list