[Numpy-discussion] polyfit in NumPy v1.7
Wed Feb 27 11:01:05 CST 2013
Please post inline so we have the context.
On Wed, Feb 27, 2013 at 9:40 AM, David Pine <firstname.lastname@example.org> wrote:
> Thanks for the quick reply.
> 1. I see your point about zero weights but the code in its current form
> doesn't take into account zero weights in counting the degrees of freedom,
> as you point out, so it seems to me like a moot point. More importantly,
> the documentation doesn't explain what the weights variable w is. In some
> implementations of least square fitting it is 1/sigma^2 while in this one
> it's 1/sigma. At the very least, the documentation should be more explicit.
> How does one go about changing the documentation?
weighted least squares, multiplied by weights, is useful and used in
robust regression for example with downweighting of observations, so
it's not "moot" (statsmodels RLM)
If it's called weights, I would expect multiplication. If it's called
sigma, I would expect division.
> 2. I am sorry but I don't understand your response. The matrix Vbase in
> the code is already the covariance matrix, _before_ it is scaled by fac.
> Scaling it by fac and returning Vbase*fac as the covariance matrix is
> wrong, at least according to the references I know, including "Numerical
> Recipes", by Press et al, "Data Reduction and Error Analysis for the
> Physical Sciences" by Bevington, both standard works.
Sounds like the same discussion we have about the scaling of the
covariance of the parameter estimates in scipy.optimize.curvefit.
https://github.com/scipy/scipy/pull/448 for the latest incarnation.
(my impression is roughly: Some physical sciences know the scale of
their sigma, the rest of the statistical world does not.)
> On Wed, Feb 27, 2013 at 9:03 AM, Charles R Harris
> <email@example.com> wrote:
>> On Wed, Feb 27, 2013 at 6:46 AM, David Pine <firstname.lastname@example.org> wrote:
>>> As of NumPy v1.7, numpy.polyfit includes an option for providing
>>> weighting to data to be fit. It's a welcome addition, but the
>>> implementation seems a bit non-standard, perhaps even wrong, and I wonder if
>>> someone can enlighten me.
>>> 1. The documentation does not specify what the weighting array "w" is
>>> supposed to be. After some fooling around I figured out that it is 1/sigma,
>>> where sigma is the standard deviation uncertainty "sigma" in the y data. A
>>> better implementation, which would be consistent with how weighting is done
>>> in scipy.optimize.curve_fit, would be to specify "sigma" directly, rather
>>> than "w". This is typically what the user has at hand, this syntax is more
>>> transparent, and it would make polyfit consistent with the nonlinear fitting
>>> routine scipy.optimize.curve_fit.
>> Weights can be used for more things than sigma. Another common use is to
>> set the weight to zero for bad data points. Zero is a nicer number than inf,
>> although that would probably work for ieee numbers. Inf and 0/inf == 0 are
>> modern innovations, so multiplication weights are somewhat traditional.
>> Weights of zero do need to be taken into account in counting the degrees of
>> freedom however.
>>> 2. The definition of the covariance matrix in polyfit is peculiar (or
>>> maybe wrong, I'm not sure). Towards the end of the code for polyfit, the
>>> standard covariance matrix is calculated and given the variable name
>>> "Vbase". However, instead of returning Vbase as the covariance matrix,
>>> polyfit returns the quantity Vbase * fac, where fac = resids / (len(x) -
>>> order - 2.0). fac scales as N, the number of data points, so the covariance
>>> matrix is about N times larger than it should be for large values of N; the
>>> increase is smaller for small N (of order 10). Perhaps the authors of
>>> polyfit want to use a more conservative error estimate, but the references
>>> given in the poyfit code make no mention of it. I think it would be much
>>> better to return the standard definition of the covariance matrix (see, for
>>> example, the book Numerical Recipes).
>> The resids scale as N, so fac is approximately constant. The effect of
>> more data points shows up in (A^T * A)^-1, which is probably what is called
>> covariance, but really isn't until it is scaled by fac.
>> NumPy-Discussion mailing list
> NumPy-Discussion mailing list
More information about the NumPy-Discussion