[SciPy-User] scipy.optimize named argument inconsistency
josef.pktd@gmai...
josef.pktd@gmai...
Mon Sep 5 14:56:24 CDT 2011
On Mon, Sep 5, 2011 at 2:17 PM, Christoph Deil
<deil.christoph@googlemail.com> wrote:
>
> On Sep 5, 2011, at 4:27 PM, josef.pktd@gmail.com wrote:
>
> On Mon, Sep 5, 2011 at 10:55 AM, Christoph Deil
> <deil.christoph@googlemail.com> wrote:
>
> How can I compute the covariance matrix for a likelihood fit using the
>
> routines in scipy.optimize?
>
> 1300 lines in scikits.statsmodels wrapping some scipy optimizers for
> maximum likelihood estimation
> e.g. LikelihoodModel
> https://github.com/statsmodels/statsmodels/blob/master/scikits/statsmodels/base/model.py#L81
> GenericLikelihoodModel
> https://github.com/statsmodels/statsmodels/blob/master/scikits/statsmodels/base/model.py#L421
> in the latter I use loglike and loglikeobs, depending on whether I
> need or want the Jacobian or not.
>
> This is a very nice class!
> But installing and learning statsmodels takes a while, and wanting parameter
> errors for a likelihood fit
> is a very common case for using an optimizer, so I think it would be nice to
> include that functionality in scipy.optimize itself (see code below).
>
> As far as I see the only method that returns a covariance matrix (or results
>
> from with it is easy to compute the covariance matrix) are leastsq and
>
> curve_fit. The problem with these is that they take a residual vector chi
>
> and internally compute chi2 = (chi**2).sum() and minimize that. But for a
>
> likelihood fit the quantity to be minimized is logL.sum(), i.e. no squaring
>
> is required, so as far as I can see it is not possible to do a likelihood
>
> fit with leastsq and curve_fit.
>
> On the other hand as Matt pointed out most (all) other optimization methods
>
> (like e.g. fmin) don't do this squaring and summing internally, but the user
>
> function does it if one is doing a chi2 fit, so using these it is easy to do
>
> a likelihood fit. But it is not possible to get parameter errors because
>
> these other optimizers don't return a Hessian or covariance matrix.
>
> It would be nice if there were a method in scipy.optimize to compute the
>
> covariance matrix for all optimizers.
>
> fmin and the other optimizer don't have enough assumptions to figure
> out whether the Hessian is the covariance matrix of the parameters.
> This is only true if the objective function is the loglikelihood. But
> there is no reason fmin and the others should assume this.
>
> We had a brief discussion a while ago on the mailinglist which led to
> the notes in the leastsq docstring
> http://docs.scipy.org/scipy/docs/scipy.optimize.minpack.leastsq/#leastsq
> that cov_x also is not always the (unscaled) covariance matrix.
>
> If the objective function comes from a different estimation method for
> example, then I think it's usually not the case that the (inverse)
> Hessian is the covariance matrix.
>
> Actually all that is missing from scipy to be able to get parameter errors
> in a likelihood fit (which includes chi2 fit and I believe is by far the
> most common case) with scipy.optimize (for all optimizers!) is a method to
> compute the Hessian, as already available e.g. in these packages:
> http://code.google.com/p/numdifftools/source/browse/trunk/numdifftools/core.py#1134
> https://github.com/statsmodels/statsmodels/blob/master/scikits/statsmodels/sandbox/regression/numdiff.py#L99
> Would it be possible to move one of the approx_hess methods to
> scipy.optimize (there is already a scipy.optimize.approx_fprime)?
> Looking through numpy and scipy the only differentiation method I could find
> was
> http://docs.scipy.org/doc/scipy/reference/generated/scipy.misc.derivative.html
> which is only 1D, I believe.
> """This is the example from the curve_fit docstring"""
> import numpy as np
> from scipy.optimize import curve_fit, fmin
> def func(x, a, b, c):
> return a * np.exp(-b * x) + c
> p0 = (2.5, 1.3, 0.5)
> x = np.linspace(0, 4, 50)
> y = func(x, *p0)
> np.random.seed(0)
> yn = y + 0.2 * np.random.normal(size=len(x))
> popt, pcov = curve_fit(func, x, yn)
> print 'curve_fit results:'
> print 'values:', popt
> print 'errors:', np.sqrt(pcov.diagonal())
> """And here is how to compute the fit parameter values and errors
> using one of the other optimizers (exemplified with fmin) and
> a method to compute the Hesse matrix"""
> def chi2(pars):
> chi = yn - func(x, *pars)
> return (chi ** 2).sum()
> popt = fmin(chi2, p0, disp=False)
> from numpy.dual import inv
> from scikits.statsmodels.sandbox.regression.numdiff import approx_hess3 as
> approx_hess
> phess = approx_hess(popt, chi2)
> def approx_covar(hess, red_chi2):
> return red_chi2 * inv(phess / 2.)
> pcov = approx_covar(popt, chi2(popt) / (len(x) - len(p0)))
I don't quite see where the normalizations, e.g. /2., are coming from.
However, I never tried to use the Hessian with leastsq, and I would
have to look at this.
> print 'curve_fit results:'
> print 'values:', popt
> print 'errors:', np.sqrt(pcov.diagonal())
> curve_fit results:
> values: [ 2.80720814 1.24568448 0.44517316]
> errors: [ 0.12563313 0.12409886 0.05875364]
> fmin and approx_hess results:
> values: [ 2.80720337 1.24565936 0.44515526]
> errors: [ 0.12610377 0.12802944 0.05979982]
The attachment is a quickly written script to use the
GenericLikelihoodModel. The advantage of using statsmodels is that you
are (supposed to be) getting all additional results that are available
for Maximum Likelihood Estimation.
e.g. my version of this with stats.norm.pdf. I took the pattern
partially from the miscmodel that uses t distributed errors. It's
supposed to be easy to switch distributions.
-----------
class MyNonLinearMLEModel(NonLinearMLEModel):
'''Maximum Likelihood Estimation of Linear Model with nonlinear function
This is an example for generic MLE.
Except for defining the negative log-likelihood method, all
methods and results are generic. Gradients and Hessian
and all resulting statistics are based on numerical
differentiation.
'''
def _predict(self, params, x):
a, b, c = params
return a * np.exp(-b * x) + c
mod = MyNonLinearMLEModel(yn, x)
res = mod.fit(start_params=[1., 1., 1., 1.])
print 'true parameters'
print np.array(list(p0)+[0.2])
print 'parameter estimates'
print res.params
print 'standard errors, hessian, jacobian, sandwich'
print res.bse
print res.bsejac
print res.bsejhj
-------
curve_fit results:
values: [ 2.80720815 1.24568449 0.44517316]
errors: [ 0.12563313 0.12409886 0.05875364]
Optimization terminated successfully.
Current function value: -7.401006
Iterations: 141
Function evaluations: 247
true parameters
[ 2.5 1.3 0.5 0.2]
parameter estimates
[ 2.80725714 1.24569287 0.44515168 0.20867979]
standard errors, hessian, jacobian, sandwich
[ 0.12226468 0.12414298 0.05798039 0.02086842]
[ 0.15756494 0.15865264 0.06242619 0.02261329]
[ 0.09852726 0.10175901 0.05740336 0.01994902]
>>> res.aic
-6.8020120409185836
>>> res.bic
0.84607998079400026
>>> res.t_test([0,1,0,0],1)
Traceback (most recent call last):
File "<pyshell#7>", line 1, in <module>
res.t_test([0,1,0,0],1)
File "E:\Josef\eclipsegworkspace\statsmodels-git\statsmodels-josef\scikits\statsmodels\base\model.py",
line 1020, in t_test
df_denom=self.model.df_resid)
AttributeError: 'MyNonLinearMLEModel' object has no attribute 'df_resid'
>>> res.model.df_resid = len(yn) - len(p0)
>>> res.t_test([0,1,0,0],1)
<scikits.statsmodels.stats.contrast.ContrastResults object at 0x04C60FB0>
>>> print res.t_test([0,1,0,0],1)
<T test: effect=array([ 1.24569287]), sd=array([[ 0.12414298]]),
t=array([[ 1.97911215]]), p=array([[ 0.02683928]]), df_denom=47>
>>>
GenericLikelihoodModel is still not polished, df_resid is not defined
generically. bootstrap raised an exception.
I agree that scipy should have numerical differentiation, jacobian
Hessian like numdifftools.
Josef
>
>
> And it would be nice if the super-fast Levenberg-Marquard optimizer called
>
> by leastsq were also available for likelihood fits.
>
> Maybe it would be possible to add one method to scipy.optimize to compute
>
> the covariance matrix after any of the optimizers has run,
>
> i.e. the best-fit pars have been determined?
>
> cov = scipy.optimize.cov(func, pars)
>
> I think this can be done by computing the Hessian via finite differences and
>
> then inverting it.
>
> leastsq seems to compute the Hessian from a "Jacobian". This is the part I
>
> don't understand, but without going into the details, let me ask this:
>
> Is there something special about the Levenberg-Marquard optimizer that it
>
> requires the individual observations?
>
> It uses the outer (?) product of the Jacobian (all observations) to
> find the improving directions, and the outerproduct of the Jacobian is
> a numerical approximation to the Hessian in the
> err = y-f(x,params) or loglikelihood case.
>
> The Hessian calculation can break down quite easily (not positive
> definite because of numerical problems), the product of the Jacobian
> is more robust in my experience and cheaper. (statsmodels
> GenericLikelihood has covariance based on Hessian, Jacobian and a
> sandwich of the two).
>
> Its true that there are often numerical problems with the Hessian.
> If the Jacobian method is more robust, then maybe that could be included in
> scipy.optimize instead of or in addition to the Hesse method?
>
> But I never looked at the internals of minpack.
>
> Josef
>
>
>
> Or is it just that the current implementation of _minpack._lmdif (which is
>
> called by leastsq) was written such that it works this way (i.e. includes a
>
> computation of cov_x in addition to x) and it could also be written to take
>
> a scalar func like all the other optimizers?
>
> Christoph
>
>
>
>
>
>
> _______________________________________________
> SciPy-User mailing list
> SciPy-User@scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user
>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: try_lonlin.py
Type: text/x-python
Size: 2660 bytes
Desc: not available
Url : http://mail.scipy.org/pipermail/scipy-user/attachments/20110905/6d198f24/attachment.py
More information about the SciPy-User
mailing list