[SciPy-User] Curve fitting questions

Gökhan Sever gokhansever@gmail....
Wed Oct 20 13:13:41 CDT 2010


On Tue, Oct 19, 2010 at 9:00 PM,  <josef.pktd@gmail.com> wrote:
>>>> ccn_ss1 = [0.27, 0.34, 0.57]
>>>> ccn_conc1 = np.array([383.51237409766452, 424.82669523141652, 511.48197391304342])
>
>>>> def my_ck(x, a, b):
>   return a*x**b
>
>>>> tfit1, pcov1 = curve_fit(my_ck, ccn_ss1, ccn_conc1)
>>>> tfit1
> array([  6.33851519e+02,   3.78527717e-01])
>
>
>>>> stats.linregress(np.log(ccn_ss1), np.log(ccn_conc1))
> (0.38096158507713485, 6.4541006630438478, 0.99864456413652103,
> 0.033150010788760682, 0.019855348412039904)
>>>> np.exp(6.4541006630438478)
> 635.30211858377766
>>>> stats.linregress(np.log(ccn_ss1[:-1]), np.log(ccn_conc1[:-1]))
> (0.44381311635631338, 6.5304711876039025, 1.0, nan, nan)
>>>> np.exp(6.5304711876039025)
> 685.72123873003966
>>>> stats.linregress(np.log(ccn_ss1[:-2]), np.log(ccn_conc1[:-2]))
> (nan, nan, 0.0, nan, nan)

It makes much more sense now that you demonstrate your words with
code :) I had never approached this question via linregress.
Thanks for the demo.
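
To make sure I follow, here is how I reproduce both routes on my end
(a minimal, self-contained sketch using the data quoted above):

import numpy as np
from scipy import stats
from scipy.optimize import curve_fit

# Data from the thread.
ccn_ss1 = np.array([0.27, 0.34, 0.57])
ccn_conc1 = np.array([383.51237409766452, 424.82669523141652,
                      511.48197391304342])

def my_ck(x, a, b):
    return a * x**b

# Route 1: non-linear least squares on the power law directly.
tfit1, pcov1 = curve_fit(my_ck, ccn_ss1, ccn_conc1)

# Route 2: ordinary least squares on the log-log data; the slope
# is b and exp(intercept) is a.
slope, intercept, r, p, stderr = stats.linregress(np.log(ccn_ss1),
                                                  np.log(ccn_conc1))

print(tfit1)                     # ~[633.85, 0.3785]
print(np.exp(intercept), slope)  # ~635.30, 0.3810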

I guess there is no way to perform such a regression with only one
data pair. In my case I have proxy approaches that require making
further assumptions, using data from different sections of my analysis.
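
A quick check seems to confirm this: with two free parameters and a
single observation the problem is underdetermined, and the underlying
MINPACK routine refuses it (a minimal sketch; the exact error text
may vary by version):

import numpy as np
from scipy.optimize import curve_fit

def my_ck(x, a, b):
    return a * x**b

# One observation, two parameters (a, b): underdetermined.
try:
    curve_fit(my_ck, np.array([0.27]), np.array([383.5]), p0=(1.0, 1.0))
except TypeError as err:
    print(err)  # e.g. "Improper input: N=2 must not exceed M=1"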

>
> strangely leastsq/curve_fit has a better fit than linregress for exact
> solution (2 observations)
>
>>>> tfit1, pcov1 = curve_fit(my_ck, ccn_ss1[:-1], ccn_conc1[:-1], p0=(1,1))
>>>> my_ck(ccn_ss1[:-1], *tfit1)
> array([ 383.5123741 ,  424.82669523])
>>>> my_ck(ccn_ss1[:-1], *tfit1) - ccn_conc1[:-1]
> array([ 0.,  0.])
>>>> my_ck(np.asarray(ccn_ss1[:-1]), np.exp(6.5304711876039025), 0.44381311635631338) - ccn_conc1[:-1]
> array([  1.70530257e-13,   3.41060513e-13])
>

The error is negligible. So far I have had good results from the
curve_fit and leastsq functions. I use linregress mostly for
obtaining linear fit parameters and r^2 values.
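
For reference, this is the pattern I use (a minimal sketch with the
rounded data from above):

import numpy as np
from scipy import stats

x = np.array([0.27, 0.34, 0.57])
y = np.array([383.5, 424.8, 511.5])

# linregress returns (slope, intercept, r_value, p_value, stderr);
# squaring r_value gives the coefficient of determination.
slope, intercept, r_value, p_value, stderr = stats.linregress(x, y)
print(slope, intercept, r_value**2)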


> If you have reasonably good information about the function or the
> range of starting values, then this always works better and faster for
> non-linear optimization. An interesting alternative that James was
> using for distribution estimation, is to use a global optimizer
> (differential evolution) in combination with a non-linear optimizer.
> You could also just draw several random starting values. Since your
> optimization problem is very small, it would still be fast.

Could you give an example of this approach?
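
My naive guess at the random-starting-values variant is the sketch
below; the parameter ranges and the number of draws are my own
assumptions, not something from your description:

import numpy as np
from scipy.optimize import curve_fit

def my_ck(x, a, b):
    return a * x**b

ccn_ss1 = np.array([0.27, 0.34, 0.57])
ccn_conc1 = np.array([383.51237409766452, 424.82669523141652,
                      511.48197391304342])

rng = np.random.RandomState(0)
best_fit, best_sse = None, np.inf

# Draw several random starting values and keep the fit with the
# smallest sum of squared residuals.
for _ in range(20):
    p0 = (rng.uniform(1.0, 1000.0), rng.uniform(0.0, 1.0))
    try:
        popt, pcov = curve_fit(my_ck, ccn_ss1, ccn_conc1, p0=p0)
    except RuntimeError:
        continue  # this start did not converge; try the next one
    sse = np.sum((my_ck(ccn_ss1, *popt) - ccn_conc1) ** 2)
    if sse < best_sse:
        best_fit, best_sse = popt, sse

print(best_fit, best_sse)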

Thanks.

-- 
Gökhan
