# [SciPy-User] ancova with optimize.curve_fit

josef.pktd@gmai... josef.pktd@gmai...
Thu Dec 2 16:43:16 CST 2010

```On Thu, Dec 2, 2010 at 4:51 PM, Peter Tittmann <ptittmann@gmail.com> wrote:
> here is some of the data. Josef, i'm not sure I understand your suggestion.
> dbh is the dependent variable and is id (actually a list). The independent
> variable is height and the categorical variable to test for covariance with
> is plot.
> Maybe I'm confused and am trying to do something that cant be done this
> way...

I don't understand what your getDiam function is supposed to do. In
ancova, indHtPlot is 2d (nobs,2) with variables dbh and plot. getdiam
calculates [a * dbh**b, a * plot**b] a (nobs,2) array, but it should
produce instead a 1d (nobs,) array.
Maybe you want to sum the functions with dbh and plot for each
observation sum([a * dbh**b, a * plot**b], axis=1)

This should solve the curvefit problem, but if plot is categorical,
then this wouldn't be correct since it is just treated as metric
variable. Maybe you want some dummy variables for plot instead. ???

Josef

> Thanks for any further assistance..
> Peter
>
> --
> Peter Tittmann
>
>
> On Thursday, December 2, 2010 at 1:33 PM, josef.pktd@gmail.com wrote:
>
> On Thu, Dec 2, 2010 at 2:55 PM, Peter Tittmann <ptittmann@gmail.com> wrote:
>
> Greetings,
> Im attempting to conduct analysis of covariance (ANCOVA) using a non-linear
> regression with curve_fit in optimize. The doc string states:
> "xdata : An N-length sequence or an (k,N)-shaped array for functions with k
> predictors. The independent variable where the data is measured."
> I am hoping that this means that if I pass the independent variable and a
> categorical variable, the resulting covariance matrix will reflect the
> variability in the equation coefficients among the categorical variables.
> 1. Is tis the case?
> 2. If so, i'm having a problem with the input array for xdata. The following
> extracts data from a relational database (thats the sql). the eqCoeff()
> function works fine, however when I add a second dimension to the xdata in
> th ancova() function (indHtPlot), curve fit produces an error which seems to
> be related to the structure of my input array. I've tried column_stack and
> vstack to form the arrays. Any assistance would be gratefully received.
>
> import birdseye_db as db
> import numpy as np
> from scipy.optimize import curve_fit
> def getDiam(ht, a, b):
>     dbh = a * ht**b
>     return dbh
>
> if ht is 2dimensional, then dbh is also two dimensional (n,k). there
> should be a reduce, e.g. sum in here so that the return is 1d.
>
> Josef
>
>
> def eqCoeff():
>     '''estimates coefficients a and b in dbh= a* h**b using all trees where
> height was measured'''
>     species=[i[0].strip(' ') for i in db.query('select distinct species from
> plots')]
>     res3d=db.query('select dbh, height, species from plots where ht_code=1')
>     indHt=[i[1] for i in res3d]
>     depDbh=[i[0] for i in res3d]
>     estimated_params, err_est = curve_fit(getDiam, indHt, depDbh)
>     return estimated_params, err_est
>
> def ancova():
>     res=db.query('select dbh, height, plot, species from plots where
> ht_code=1')
>     indHtPlot= np.column_stack(([i[1] for i in res],[i[2] for i in res] ))
>     depDbh=[i[0] for i in res]
>     estimated_params, err_est = curve_fit(getDiam, indHtPlot, depDbh)
>     return estimated_params, err_est
> Thanks in advance
> --
> Peter Tittmann
>
>
> _______________________________________________
> SciPy-User mailing list
> SciPy-User@scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user
>
>
> _______________________________________________
> SciPy-User mailing list
> SciPy-User@scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user
>
>
> _______________________________________________
> SciPy-User mailing list
> SciPy-User@scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user
>
>
```