Polyfit

Charles R Harris charlesr.harris at gmail.com
Thu Oct 12 18:04:15 CDT 2006


On 10/12/06, Charles R Harris <charlesr.harris at gmail.com> wrote:
>
>
>
> On 10/12/06, Charles R Harris <charlesr.harris at gmail.com> wrote:
> >
> >
> >
> > On 10/12/06, Greg Willden < gregwillden at gmail.com> wrote:
> > >
> > > On 10/12/06, Charles R Harris <charlesr.harris at gmail.com > wrote:
> > > >
> > > > I'm guessing that the rcond number in the lstsq version (default
> > > > 1e-10) is the difference. Generally the lstsq version should work better
> > > > than the MPL version because at*a is not as well conditioned and vandermonde
> > > > matrices are notoriously ill conditioned anyway for higher degree fits. It
> > > > would help if you attached the data files in ascii form unless they happen
> > > > to contain thousands of data items. Just the x will suffice, zip it up if
> > > > you have to.
> > > >
> > >
> > >
> > > Here are the files.
> > >
> > > Since the two algorithms behave differently and each has it place then
> > > can both be included in numpy?
> > > i.e. numpy.polyfit(x,y,N, mode='alg1')
> > > numpy.polyfit (x,y,N, mode='alg2')
> > >
> > > replacing alg1 and alg2 with meaningful names.
> > >
> >
> > The polyfit function looks seriously busted. If I do the fits by hand I
> > get the same results using the (not so hot) MPL version or lstsq. I don't
> > know what the problem is. The docstring is also incorrect for the method.
> > Hmmm...
> >
>
> Polyfit seems overly conservative in its choice of rcond.
>
> In [101]: lin.lstsq(v,y,1e-10)[0]
> Out[101]:
> array([  5.84304475e-07,  -5.51513630e-03,   1.51465472e+01,
>          3.05631361e-02])
> In [107]: polyfit(x,y,3)
> Out[108]:
> array([  5.84304475e-07,  -5.51513630e-03,   1.51465472e+01,
>          3.05631361e-02])
>
> Compare
>
> In [112]: lin.lstsq(v,y,1e-12)[0]
> Out[112]:
> array([ -5.42970700e-07,   1.61425067e-03,   1.99260667e+00,
>          6.51889107e+03])
>
> In [113]: dot(lin.inv(vtv),dot(v.T,y))
> Out[113]:
> array([ -5.42970700e-07,   1.61425067e-03,   1.99260667e+00,
>          6.51889107e+03])
>
> So the default needs to be changed somewhere. Probably polyfit shoud
> accept rcond as a keyword. Where the actual problem lies is a bit obscure as
> the normal rcond default for lin.lstsq is 1e-12. Maybe some sort of import
> error somewhere down the line.
>

And here is the location of the problem in numpy/linalg/linalg.py :

def lstsq(a, b, rcond=1.e-10):

The 1e-10 is a bit conservative. On the other hand, I will note that the
condition number of the dot(V^T ,V) matrix is somewhere around 1e22, which
means in general terms that you need around 22 digits of accuracy. Inverting
it only works sorta by accident in the current case. Generally, using
Vandermonde matrices and polynomial fits it a bad idea when the dynamic
range of the interval gets large and the degree gets up around 4-5 as it
leads to ill conditioned sets of equations. When you really need the best
start with chebychev polynomials or, bestest, compute a set of polynomials
orthogonal over the sample points. Anyway, I think rcond should be something
like 1e-12 or 1e-13 by default and be available as a keyword in the polyfit
function. If no one complains I will make this change, although it is just a
bandaid and things will fall apart again as soon as you call polyfit(x,y,4).

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://projects.scipy.org/pipermail/numpy-discussion/attachments/20061012/34ee87ad/attachment.html 
-------------- next part --------------
-------------------------------------------------------------------------
Using Tomcat but need to do more? Need to support web services, security?
Get stuff done quickly with pre-integrated technology to make your job easier
Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642
-------------- next part --------------
_______________________________________________
Numpy-discussion mailing list
Numpy-discussion at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/numpy-discussion


More information about the Numpy-discussion mailing list