[SciPy-User] scipy.optimize named argument inconsistency

Charles R Harris charlesr.harris@gmail....
Mon Sep 5 14:42:44 CDT 2011


On Mon, Sep 5, 2011 at 12:17 PM, Christoph Deil <deil.christoph@googlemail.com> wrote:

>
> On Sep 5, 2011, at 4:27 PM, josef.pktd@gmail.com wrote:
>
> On Mon, Sep 5, 2011 at 10:55 AM, Christoph Deil
> <deil.christoph@googlemail.com> wrote:
>
>
> How can I compute the covariance matrix for a likelihood fit using the
> routines in scipy.optimize?
>
>
> 1300 lines in scikits.statsmodels  wrapping some scipy optimizers for
> maximum likelihood estimation
> e.g. LikelihoodModel
>
> https://github.com/statsmodels/statsmodels/blob/master/scikits/statsmodels/base/model.py#L81
> GenericLikelihoodModel
>
> https://github.com/statsmodels/statsmodels/blob/master/scikits/statsmodels/base/model.py#L421
> in the latter I use loglike and loglikeobs, depending on whether I
> need or want the Jacobian or not.
>
>
> This is a very nice class!
>
> But installing and learning statsmodels takes a while, and wanting
> parameter errors for a likelihood fit
> is a very common case for using an optimizer, so I think it would be
> nice to include that functionality in scipy.optimize itself (see code
> below).
>
>
> As far as I can see, the only methods that return a covariance matrix (or
> results from which it is easy to compute the covariance matrix) are leastsq
> and curve_fit. The problem with these is that they take a residual vector chi
> and internally compute chi2 = (chi**2).sum() and minimize that. But for a
> likelihood fit the quantity to be minimized is logL.sum(), i.e. no squaring
> is required, so as far as I can see it is not possible to do a likelihood
> fit with leastsq or curve_fit.
>
> On the other hand as Matt pointed out most (all) other optimization methods
> (like e.g. fmin) don't do this squaring and summing internally, but the user
> function does it if one is doing a chi2 fit, so using these it is easy to do
> a likelihood fit. But it is not possible to get parameter errors because
> these other optimizers don't return a Hessian or covariance matrix.
>
> It would be nice if there were a method in scipy.optimize to compute the
> covariance matrix for all optimizers.
>
>
> fmin and the other optimizers don't have enough assumptions to figure
> out whether the Hessian is the covariance matrix of the parameters.
> This is only true if the objective function is the loglikelihood. But
> there is no reason fmin and the others should assume this.
>
> We had a brief discussion a while ago on the mailing list which led to
> the notes in the leastsq docstring
> http://docs.scipy.org/scipy/docs/scipy.optimize.minpack.leastsq/#leastsq
> that cov_x also is not always the (unscaled) covariance matrix.
>
> If the objective function comes from a different estimation method for
> example, then I think it's usually not the case that the (inverse)
> Hessian is the covariance matrix.
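>
> To make the distinction concrete: only when the objective handed to the
> optimizer is the negative log-likelihood does the inverse of its Hessian at
> the optimum estimate the parameter covariance. A minimal sketch, assuming
> numdifftools is installed (none of these names are scipy API):
>
> import numpy as np
> from scipy.optimize import fmin
> import numdifftools as nd
>
> np.random.seed(0)
> data = np.random.normal(loc=5.0, scale=2.0, size=1000)
>
> def negloglike(pars):
>     # negative log-likelihood of a normal sample
>     mu, sigma = pars
>     return -np.sum(-0.5 * np.log(2 * np.pi * sigma ** 2)
>                    - (data - mu) ** 2 / (2 * sigma ** 2))
>
> mle = fmin(negloglike, [4.0, 1.5], disp=False)
> hess = nd.Hessian(negloglike)(mle)      # Hessian of -logL at the MLE
> cov = np.linalg.inv(hess)               # asymptotic covariance of the MLE
> print 'std errors:', np.sqrt(np.diag(cov))
> print 'analytic error on the mean:', mle[1] / np.sqrt(len(data))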
>
>
> Actually all that is missing from scipy to be able to get parameter errors
> in a likelihood fit (which includes chi2 fit and I believe is by far the
> most common case) with scipy.optimize (for all optimizers!) is a method to
> compute the Hessian, as already available e.g. in these packages:
>
> http://code.google.com/p/numdifftools/source/browse/trunk/numdifftools/core.py#1134
>
> https://github.com/statsmodels/statsmodels/blob/master/scikits/statsmodels/sandbox/regression/numdiff.py#L99
>
> Would it be possible to move one of the approx_hess methods to
> scipy.optimize (there is already a scipy.optimize.approx_fprime)?
>
> Looking through numpy and scipy the only differentiation method I could
> find was
>
> http://docs.scipy.org/doc/scipy/reference/generated/scipy.misc.derivative.html
> which is only 1D, I believe.
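>
> One way an approx_hess could be built on top of the existing
> scipy.optimize.approx_fprime is to finite-difference the gradient itself.
> A rough sketch (the helper name, step size and forward-difference scheme
> are illustrative, not existing scipy API):
>
> import numpy as np
> from scipy.optimize import approx_fprime
>
> def approx_hess_fd(x0, func, epsilon=1e-5):
>     # numerical Hessian: forward-difference the numerical gradient
>     x0 = np.asarray(x0, dtype=float)
>     n = len(x0)
>     grad0 = approx_fprime(x0, func, epsilon)
>     hess = np.zeros((n, n))
>     for i in range(n):
>         xi = x0.copy()
>         xi[i] += epsilon
>         hess[i] = (approx_fprime(xi, func, epsilon) - grad0) / epsilon
>     return (hess + hess.T) / 2.   # symmetrize to damp differencing noise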
>
> """This is the example from the curve_fit docstring"""
> import numpy as np
> from scipy.optimize import curve_fit, fmin
> def func(x, a, b, c):
>     return a * np.exp(-b * x) + c
> p0 = (2.5, 1.3, 0.5)
> x = np.linspace(0, 4, 50)
> y = func(x, *p0)
> np.random.seed(0)
> yn = y + 0.2 * np.random.normal(size=len(x))
> popt, pcov = curve_fit(func, x, yn)
> print 'curve_fit results:'
> print 'values:', popt
> print 'errors:', np.sqrt(pcov.diagonal())
>
> """And here is how to compute the fit parameter values and errors
> using one of the other optimizers (exemplified with fmin) and
> a method to compute the Hessian matrix"""
> def chi2(pars):
>     chi = yn - func(x, *pars)
>     return (chi ** 2).sum()
> popt = fmin(chi2, p0, disp=False)
> from numpy.dual import inv
> from scikits.statsmodels.sandbox.regression.numdiff import approx_hess3 as approx_hess
> phess = approx_hess(popt, chi2)
> def approx_covar(hess, red_chi2):
>     # the Hessian of chi2 is ~2 * J^T J, hence the factor of 2 before inverting
>     return red_chi2 * inv(hess / 2.)
> pcov = approx_covar(phess, chi2(popt) / (len(x) - len(p0)))
> print 'fmin and approx_hess results:'
> print 'values:', popt
> print 'errors:', np.sqrt(pcov.diagonal())
>
> curve_fit results:
> values: [ 2.80720814  1.24568448  0.44517316]
> errors: [ 0.12563313  0.12409886  0.05875364]
> fmin and approx_hess results:
> values: [ 2.80720337  1.24565936  0.44515526]
> errors: [ 0.12610377  0.12802944  0.05979982]
>
>
>
> And it would be nice if the super-fast Levenberg-Marquardt optimizer called
> by leastsq were also available for likelihood fits.
>
> Maybe it would be possible to add one method to scipy.optimize to compute
> the covariance matrix after any of the optimizers has run, i.e. the best-fit
> pars have been determined?
>
> cov = scipy.optimize.cov(func, pars)
>
> I think this can be done by computing the Hessian via finite differences and
> then inverting it.
>
> leastsq seems to compute the Hessian from a "Jacobian". This is the part I
> don't understand, but without going into the details, let me ask this:
> Is there something special about the Levenberg-Marquardt optimizer that it
> requires the individual observations?
>
>
> It uses the outer product of the Jacobian (all observations) to find the
> improving directions, and the outer product of the Jacobian is a numerical
> approximation to the Hessian in the err = y - f(x, params) or loglikelihood
> case.
>
> The Hessian calculation can break down quite easily (not positive definite
> because of numerical problems); the product of the Jacobian is more robust
> in my experience and cheaper. (statsmodels GenericLikelihood has covariance
> based on Hessian, Jacobian and a sandwich of the two.)
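>
> As an illustration of this Jacobian-based route (a rough sketch continuing
> the fmin example above, so yn, func, x, popt and p0 are assumed to be
> defined; the step size is arbitrary):
>
> import numpy as np
>
> def resid(pars):
>     # per-observation residuals, err = y - f(x, params)
>     return yn - func(x, *pars)
>
> def numerical_jacobian(pars, eps=1e-6):
>     pars = np.asarray(pars, dtype=float)
>     r0 = resid(pars)
>     J = np.empty((len(r0), len(pars)))
>     for j in range(len(pars)):
>         dp = pars.copy()
>         dp[j] += eps
>         J[:, j] = (resid(dp) - r0) / eps
>     return J
>
> J = numerical_jacobian(popt)
> red_chi2 = (resid(popt) ** 2).sum() / (len(x) - len(p0))
> pcov_jac = red_chi2 * np.linalg.inv(np.dot(J.T, J))  # J^T J in place of Hessian/2
> print 'errors (J^T J):', np.sqrt(pcov_jac.diagonal())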
>
>
> It's true that there are often numerical problems with the Hessian.
> If the Jacobian method is more robust, then maybe that could be included in
> scipy.optimize instead of, or in addition to, the Hessian method?
>
>
The Levenberg-Marquardt algorithm uses a parameter linearization, i.e., the
Jacobian, as the design matrix of ordinary least squares together with
adaptive regularization. It is the ordinary least squares approach that
dictates the vector form of the error to be minimized. The (unscaled)
covariance of the parameters is the usual (A^T A)^{-1} of ordinary least
squares with the Jacobian in place of A.
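
For reference, this is the quantity leastsq already exposes: with
full_output=1 it returns cov_x, the (unscaled) (J^T J)^{-1} built from the
Jacobian at the solution, and scaling it by the reduced chi-square gives the
parameter covariance. A short sketch, reusing yn, func, x and p0 from the
example quoted above:

from scipy.optimize import leastsq
import numpy as np

def resid(pars):
    return yn - func(x, *pars)   # same residual vector as above

popt, cov_x, infodict, mesg, ier = leastsq(resid, p0, full_output=1)
red_chi2 = (resid(popt) ** 2).sum() / (len(x) - len(p0))
pcov = red_chi2 * cov_x          # the same scaling curve_fit applies
print 'errors:', np.sqrt(pcov.diagonal())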

Chuck