[SciPy-user] curve_fit step-size and optimal parameters

josef.pktd@gmai... josef.pktd@gmai...
Wed Jun 10 11:12:51 CDT 2009


On Wed, Jun 10, 2009 at 3:58 AM, Sebastian
Walter<sebastian.walter@gmail.com> wrote:
> If you try to fit the frequency with the least-squares distance, the
> problem is not only the nonlinearity
> but rather the fact that the objective function has many local minimizers.
> At least that's what I have observed in a toy example once.
>
> Has anyone experience what to do in that case? (Maybe use L1 norm instead?)

I would look at estimation in the frequency domain, which I know next
to nothing about.
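
Just to illustrate that idea (not something I have used in practice):
the peak of the FFT can at least give a starting value for the
frequency, assuming the x values are roughly evenly spaced. A minimal
sketch, guess_w being a made-up helper name:

import numpy as np

def guess_w(x, y):
    # crude initial guess for the angular frequency from the FFT peak;
    # assumes the sample points x are (approximately) evenly spaced
    dt = np.mean(np.diff(x))
    yd = y - y.mean()                        # drop the constant offset
    n = len(yd)
    spec = np.abs(np.fft.fft(yd))[:n // 2]
    freqs = np.fft.fftfreq(n, d=dt)[:n // 2]
    spec[0] = 0.0                            # ignore any leftover DC term
    return 2*np.pi*freqs[spec.argmax()]

A starting value from this could then go into p0 for curve_fit, which
should help with the local minima that Sebastian mentions.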

But for your example, I managed to get the estimate either by adding a
bit of noise to your y, or by estimating the constant separately. When
I removed the constant, the indeterminacy (?) in the parameter estimates
went away. The estimation also worked if there was a small trend.

The other approach I would try in cases like this is to add a
penalization term (as in Tikhonov regularization or ridge regression)
to the objective function, but I haven't tried how well this would work
in your case. A rough sketch is below.
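
Something along these lines is what I have in mind. It is only a
sketch: it uses leastsq instead of curve_fit, and the penalty weight
lam is a placeholder that would need tuning:

import numpy as np
from scipy.optimize import leastsq

def penalized_fit(x, y, p0, lam=1e-4):
    # ridge/Tikhonov-penalized least squares for a + b*sin(c*x + d):
    # minimize sum((y - model)**2) + lam*sum(p**2)
    def residuals(p):
        a, b, c, d = p
        res = y - (a + b*np.sin(c*x + d))
        # leastsq squares and sums this vector, so appending
        # sqrt(lam)*p contributes the lam*sum(p**2) penalty
        return np.concatenate([res, np.sqrt(lam)*p])
    popt, ier = leastsq(residuals, np.asarray(p0, dtype=float))
    return popt

Appending sqrt(lam)*p to the residual vector is what turns the plain
least-squares problem into the penalized one; shrinking all four
parameters toward zero is crude, but it is the same idea as the fp
function in the attached script.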

Josef


>
>
>
> On Mon, Jun 8, 2009 at 10:19 PM, Robert Kern<robert.kern@gmail.com> wrote:
>> 2009/6/8 Stéfan van der Walt <stefan@sun.ac.za>:
>>> 2009/6/8 Robert Kern <robert.kern@gmail.com>:
>>>> On Mon, Jun 8, 2009 at 14:59, ElMickerino<elmickerino@hotmail.com> wrote:
>>>>> My question is, how can I get curve_fit to use a very small step-size for
>>>>> the phase, or put in strict limits, so that I get a robust fit?  I
>>>>> don't want to tune the phase by hand for each of my 60+ datasets.
>>>>
>>>> You really can't. I recommend the A*sin(w*t)+B*cos(w*t)
>>>> parameterization rather than the A*sin(w*t+phi) one.
>>>
>>> Could you expand?  I can't immediately see why the second
>>> parametrisation is bad.
>>
>> The cyclic nature of phi. It complicates things precisely as the OP describes.
>>
>>> Can't a person do this fit using non-linear
>>> least-squares?  Ah, that's probably why you use the other
>>> parametrisation, so that you don't have to use non-linear least
>>> squares?
>>
>> If you aren't also fitting the frequency, then yes. If you are fitting
>> for the frequency, too, the problem is still non-linear.
>>
>> --
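
To spell out Robert's suggested parameterization above: once the
angular frequency w is fixed, fitting A*sin(w*x) + B*cos(w*x) + C is an
ordinary linear least-squares problem, and the amplitude and phase can
be recovered afterwards without any cyclic phase parameter in the fit.
A minimal sketch (fit_linear is just a made-up name):

import numpy as np

def fit_linear(x, y, w):
    # y ~ A*sin(w*x) + B*cos(w*x) + C, linear in (A, B, C) for fixed w
    M = np.column_stack([np.sin(w*x), np.cos(w*x), np.ones_like(x)])
    coef, resid, rank, sv = np.linalg.lstsq(M, y)
    A, B, C = coef
    amp = np.hypot(A, B)        # amplitude b in b*sin(w*x + d) + a
    phase = np.arctan2(B, A)    # phase d, recovered after the fit
    return amp, phase, C

If the frequency has to be estimated as well, the problem stays
nonlinear, as Robert says, but this linear step can still serve as the
inner loop of a one-dimensional search over w.
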
-------------- next part --------------
# author: michael a schmidt
# purpose: fit sinusoid with constant offset

import numpy as np           # the np.* calls below need numpy itself
from numpy import *          # for sin, pi, array, sqrt, zeros, ...
from scipy.optimize import curve_fit

def f(x, a, b, c, d):
    # sinusoid with a constant offset
    return a + b*sin(c*x + d)
    #return (a + b*sin(c*x)) + d*cos(e*x)   # alternative model, not used

def f2(x, b, c, d):
    # sinusoid without the constant term (offset estimated separately)
    return b*sin(c*x + d)

def fp(y, x, a, b, c, d):
    # ridge-penalized sum of squares (Tikhonov), defined but not used below
    return np.sum((y - (a + b*sin(c*x + d)))**2) + 1e-4*(a**2 + b**2 + c**2 + d**2)

def do_it():

    x, y = load_data('data_file.txt')
    
    # rescale, and add a little noise (the fit behaved better with some noise)
    y = (1e6)*y + 0.05*np.random.normal(size=y.shape)

    f_probe = 5.380
    f_pump = 5.408 #Hz
    f_beat = abs(f_probe - f_pump)
    w_beat = 2.*pi*f_beat

    V0 = 0.5*(max(y) + min(y))
    V1 = max(y) - V0
    d0 = 0.0 #initial guess of zero phase
    p0=[V0, V1, w_beat, d0]
    
    popt, pcov = curve_fit(f, x, y, p0=p0)
    
    print "found: V1 = %f +/- %f" % (popt[1], sqrt(pcov[1][1]))
    return popt, x,y, p0, pcov

def do_it2():

    x, y = load_data('data_file.txt')
    
    y = (1e6)*y   # rescale only; no noise added in this variant

    f_probe = 5.380
    f_pump = 5.408 #Hz
    f_beat = abs(f_probe - f_pump)
    w_beat = 2.*pi*f_beat

    V0 = 0.5*(max(y) + min(y))
    V1 = max(y) - V0
    d0 = 0.0 #initial guess of zero phase
    p0=[V0, V1, w_beat, d0]
    
    # fit the sinusoid to the demeaned data, then estimate the constant
    # offset as the mean of the residuals
    popt, pcov = curve_fit(f2, x, y-y.mean(), p0=[V1, w_beat, d0])
    aest = (y - f2(x, *popt)).mean()
    popt2 = np.hstack((aest, popt))
    
    print "found: V1 = %f +/- %f" % (popt[1], sqrt(pcov[1][1]))
    return popt2, x,y, p0, pcov

def load_data(in_file):
    # read two whitespace-separated columns (x, y) from a text file
    fin = open(in_file, 'r')
    lines = fin.readlines()
    fin.close()
    x = zeros(len(lines))
    y = zeros(len(lines))
    for i, line in enumerate(lines):
        v1, v2 = line.split()[:2]
        x[i] = float(v1)
        y[i] = float(v2)
    return x, y
    
popt, x,y, p0, pcov = do_it()

import matplotlib.pyplot as plt
yh = f(x, *popt)
plt.figure()
plt.plot(x,y,x,yh)
pstd = np.sqrt(np.diag(pcov))
# parameter correlation matrix of the first fit
print pcov/np.outer(pstd, pstd)

popt, x,y, p0, pcov = do_it2()
yh = f2(x, *popt[1:]) + popt[0]   # popt[0] is the separately estimated constant
plt.figure()
plt.plot(x,y,x,yh)
#plt.show()
pstd = np.sqrt(np.diag(pcov))
# parameter correlation matrix of the second fit
print pcov/np.outer(pstd, pstd)

