[SciPy-User] Curve fitting questions
Tue Oct 19 21:00:02 CDT 2010
On Tue, Oct 19, 2010 at 4:17 PM, <email@example.com> wrote:
> On Tue, Oct 19, 2010 at 3:20 PM, Gökhan Sever <firstname.lastname@example.org> wrote:
>> On Tue, Oct 19, 2010 at 1:04 PM, <email@example.com> wrote:
>>> I think you could take logs and you would have a linear function in
>>> param=log(x), and you could use linalg to solve for param, and then
>>> transform back exp(param). Or this would give you a starting value if
>>> you want the non-linear optimization.
>> Using 3 and 5 data-points the curve_fit usually does a good job, even without
>> the initial estimates provided. When it's necessary we usually constrain the
>> initial parameters with max CCN concentration for C (param) and a typical
>> k (param) values.
>> This even works with 2 data-points:
>> I: ccn_ss1 = [0.27, 0.34]
>> I: ccn_conc1 = np.array([383.51237409766452, 424.82669523141652])
>> I: tfit2, pcov2 = curve_fit(my_ck, ccn_ss1, ccn_conc1, p0=(424,
>> 0.5), ftol=1)
>> provides me reasonable estimations. However, having another data-point
>> would surely
>> improve the quality of the fit and estimations.
>>>> ccn_ss1 = 0.27
>>>> ccn_conc1 = 383.51237409766452
>>>> # One data point estimation fails with IndexError: index out of range for array
>>>> tfit3, pcov3 = curve_fit(my_ck, ccn_ss1, ccn_conc1, p0=tfit1, ftol=1)
>>> If you have one parameter to estimate and only one observations, then
>>> you should be able to solve it exactly with one of the solvers/
>>> rootfinders in scipy. optimize.
>> I want to estimate two parameters using one observation (which is a
>> data-pair for my case --one for ccn_ss1 and one for ccn_conc1.)
>> Probably, in this current version fsolve can't do give me any roots.
> I remembered curve_fit wrongly, I didn't remember it switched data and
> parameters in the argument list compare to leastsq.
> I need to reread your example (later today).
> Taking logs and using linalg is still more efficient (unless you
> insist on an additive error term).
>>> ccn_ss1 = [0.27, 0.34, 0.57]
>>> ccn_conc1 = np.array([383.51237409766452, 424.82669523141652, 511.48197391304342])
>>> def my_ck(x, a, b):
>>> tfit1, pcov1 = curve_fit(my_ck, ccn_ss1, ccn_conc1)
array([ 6.33851519e+02, 3.78527717e-01])
>>> stats.linregress(np.log(ccn_ss1), np.log(ccn_conc1))
(0.38096158507713485, 6.4541006630438478, 0.99864456413652103,
>>> stats.linregress(np.log(ccn_ss1[:-1]), np.log(ccn_conc1[:-1]))
(0.44381311635631338, 6.5304711876039025, 1.0, nan, nan)
>>> stats.linregress(np.log(ccn_ss1[:-2]), np.log(ccn_conc1[:-2]))
(nan, nan, 0.0, nan, nan)
strangely leastsq/curve_fit has a better fit than linregress for exact
solution (2 observations)
>>> tfit1, pcov1 = curve_fit(my_ck, ccn_ss1[:-1], ccn_conc1[:-1], p0=(1,1))
>>> my_ck(ccn_ss1[:-1], *tfit1)
array([ 383.5123741 , 424.82669523])
>>> my_ck(ccn_ss1[:-1], *tfit1) - ccn_conc1[:-1]
array([ 0., 0.])
>>> my_ck(np.asarray(ccn_ss1[:-1]), np.exp(6.5304711876039025), 0.44381311635631338) - ccn_conc1[:-1]
array([ 1.70530257e-13, 3.41060513e-13])
If you have reasonably good information about the function or the
range of starting values, then this always works better and faster for
non-linear optimization. An interesting alternative that James was
using for distribution estimation, is to use a global optimizer
(differential evolution) in combination with a non-linear optimizer.
You could also just draw several random starting values. Since your
optimization problem is very small, it would still be fast.
>> def my_ck(x, a, b):
>> return a*x**b
>> fsolve(my_ck, x0=tfit1, args=(ccn_ss1, ccn_conc1), xtol=1)
>> rather gives a couple of overflow warnings:
>> Warning: overflow encountered in power
>> In one data-pair situation my function looks like:
>> a*x**b = 383.5
>> Now there are two unknowns providing the x as ccn_ss1 as a*0.27**b =
>> 383.5. I should make one more assumption otherwise it is still
>> unsolvable. Probably making an assumption for a, then I can hand solve
>> this easily. OK, with a = 350 assumption in 350*0.27**b == 383.5, here
>> solving for b results ~ -0.065
>> With some modifications on the original fitfunc:
>> def my_ck(x):
>> return 350*0.27**x - 383
>> fsolve nicely estimates what I want.
>> fsolve(my_ck, x0=0.5)
>> SciPy-User mailing list
More information about the SciPy-User