[SciPy-User] scipy.optimize named argument inconsistency

josef.pktd@gmail.com
Mon Sep 5 10:27:53 CDT 2011


On Mon, Sep 5, 2011 at 10:55 AM, Christoph Deil
<deil.christoph@googlemail.com> wrote:
>
> On Sep 5, 2011, at 2:23 PM, josef.pktd@gmail.com wrote:
>
> On Sun, Sep 4, 2011 at 8:44 PM, Matthew Newville
> <matt.newville@gmail.com> wrote:
>
> Hi,
>
> On Friday, September 2, 2011 1:31:46 PM UTC-5, Denis Laxalde wrote:
>
> Hi,
>
> (I'm resurrecting an old post.)
>
> On Thu, 27 Jan 2011 18:54:39 +0800, Ralf Gommers wrote:
>
> On Wed, Jan 26, 2011 at 12:41 AM, Joon Ro <joon...@gmail.com> wrote:
>
> I just found that for some functions such as fmin_bfgs, the argument
> name for the objective function to be minimized is f, and for others
> such as fmin, it is func. I was wondering if this was intended,
> because I think it would be better to have consistent argument names
> across those functions.
>
>
> It's unlikely that that was intentional. A patch would be welcome.
> "func" looks better to me than "f" or "F".
>
> There are still several inconsistencies in the inputs and outputs of
> functions in the optimize package. For instance, among the input
> parameters the Jacobian is sometimes named 'fprime' and sometimes
> 'Dfun', tolerances can be 'xtol' or 'x_tol', etc. Outputs might be
> returned in a different order, e.g., fsolve returns 'x, infodict,
> ier, mesg' whereas leastsq returns 'x, cov_x, infodict, mesg, ier'.
> Some functions make use of the infodict output whereas others return
> the same data individually, etc.
>
> If you still believe (as I do) that the consistency of the optimize
> functions should be improved, I can work on it. Let me know.
>
> Also +1.
>
> I would add that the call signatures and return values of the
> user-supplied function to minimize should be made consistent too.
> Currently, some functions (leastsq) require the return value to be an
> array, while others (anneal and fmin_l_bfgs_b) require a scalar (the
> sum of squares of the residuals). That seems like a serious
> impediment to changing algorithms.
>
> I don't see how that would be possible, since it's a difference in
> algorithm: leastsq needs the values for the individual observations
> (to calculate the Jacobian), while the other ones don't care and only
> minimize an objective function that can accumulate the observations
> in an arbitrary way.
>
> Otherwise I'm also +1.
> It might be a bit messy during a deprecation period with duplicated
> argument names, and some arguments will still differ by algorithm,
> e.g. whether there are constraints and, if so, of which kind, or
> whether the objective value and its derivative come from one function
> or from two.
>
> Josef
>
>
> +1 on making the input and output of the scipy optimizer routines
> more uniform.
>
> I have one additional question, somewhat related to the leastsq issue
> Matt mentioned: how can I compute the covariance matrix for a
> likelihood fit using the routines in scipy.optimize?

There are about 1300 lines in scikits.statsmodels wrapping some of the
scipy optimizers for maximum likelihood estimation, e.g.
LikelihoodModel
https://github.com/statsmodels/statsmodels/blob/master/scikits/statsmodels/base/model.py#L81
and GenericLikelihoodModel
https://github.com/statsmodels/statsmodels/blob/master/scikits/statsmodels/base/model.py#L421
In the latter I use loglike and loglikeobs, depending on whether I
need or want the Jacobian.
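
For illustration, here is a rough sketch of how GenericLikelihoodModel
can be subclassed; the Poisson model, the made-up data, the starting
values and the import path (a newer statsmodels layout than the
scikits.statsmodels links above) are my own assumptions, not code from
the package:

import numpy as np
from scipy import stats
from statsmodels.base.model import GenericLikelihoodModel

class PoissonML(GenericLikelihoodModel):
    # Per-observation log-likelihood; the base class sums it to get
    # loglike and differentiates it numerically for score and Hessian.
    def loglikeobs(self, params):
        mu = np.exp(np.dot(self.exog, params))
        return stats.poisson.logpmf(self.endog, mu)

# made-up data: counts y, design matrix X with a constant column
rng = np.random.RandomState(0)
X = np.column_stack([np.ones(500), rng.uniform(size=500)])
y = rng.poisson(np.exp(np.dot(X, [0.5, 1.0])))

res = PoissonML(y, X).fit(start_params=[0.1, 0.1], method='nm')
print(res.params)        # ML estimates
print(res.bse)           # standard errors from the numerical Hessian
print(res.cov_params())  # covariance matrix of the estimates

The standard errors here come from inverting a numerical Hessian of
the summed log-likelihood, which is the same idea discussed below.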

> As far as I can see, the only methods that return a covariance matrix
> (or results from which it is easy to compute one) are leastsq and
> curve_fit. The problem with these is that they take a residual vector
> chi and internally compute chi2 = (chi**2).sum() and minimize that.
> But for a likelihood fit the quantity to be minimized is -logL.sum(),
> i.e. no squaring is involved, so as far as I can see it is not
> possible to do a likelihood fit with leastsq or curve_fit.
>
> On the other hand, as Matt pointed out, most (all?) other
> optimization methods (e.g. fmin) don't do this squaring and summing
> internally; the user function does it if one is doing a chi2 fit, so
> with these it is easy to do a likelihood fit. But it is not possible
> to get parameter errors, because these other optimizers don't return
> a Hessian or covariance matrix. It would be nice if there were a
> method in scipy.optimize to compute the covariance matrix for all
> optimizers.

fmin and the other optimizers don't have enough information to decide
whether the (inverse) Hessian is the covariance matrix of the
parameters. That is only true if the objective function is the
(negative) log-likelihood, and there is no reason for fmin and the
others to assume this.

We had a brief discussion a while ago on the mailing list which led to
the note in the leastsq docstring
http://docs.scipy.org/scipy/docs/scipy.optimize.minpack.leastsq/#leastsq
that cov_x is also not always the (unscaled) covariance matrix.

If the objective function comes from a different estimation method,
for example, then I think it is usually not the case that the
(inverse) Hessian is the covariance matrix.
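
When the objective really is the negative log-likelihood, this can be
done by hand after any optimizer, along the lines Christoph suggests
below. A rough sketch (the finite-difference Hessian helper, the step
size and the normal example are my own choices, not an existing
scipy.optimize function):

import numpy as np
from scipy import optimize, stats

def approx_hessian(f, x, eps=1e-5):
    # Central finite-difference Hessian of a scalar function f at x.
    x = np.asarray(x, dtype=float)
    n = x.size
    hess = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ei = np.zeros(n); ei[i] = eps
            ej = np.zeros(n); ej[j] = eps
            hess[i, j] = (f(x + ei + ej) - f(x + ei - ej)
                          - f(x - ei + ej) + f(x - ei - ej)) / (4 * eps**2)
    return hess

# made-up example: ML fit of a normal mean and log standard deviation
rng = np.random.RandomState(0)
data = rng.normal(loc=1.0, scale=2.0, size=200)

def negloglike(params):
    mu, logsig = params
    return -stats.norm.logpdf(data, loc=mu, scale=np.exp(logsig)).sum()

xopt = optimize.fmin(negloglike, x0=[0.0, 0.0], disp=False)

# Only because the objective is the negative log-likelihood does the
# inverse Hessian at the optimum estimate the parameter covariance.
cov = np.linalg.inv(approx_hessian(negloglike, xopt))
se = np.sqrt(np.diag(cov))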

> And it would be nice if the super-fast Levenberg-Marquardt optimizer
> called by leastsq were also available for likelihood fits.
>
> Maybe it would be possible to add one method to scipy.optimize that
> computes the covariance matrix after any of the optimizers has run,
> i.e. once the best-fit parameters have been determined:
>
> cov = scipy.optimize.cov(func, pars)
>
> I think this can be done by computing the Hessian via finite
> differences and then inverting it. leastsq seems to compute the
> Hessian from a "Jacobian". This is the part I don't understand, but
> without going into the details, let me ask this: is there something
> special about the Levenberg-Marquardt optimizer that requires the
> individual observations?

It uses the outer product of the Jacobian (over all observations) to
find improving directions, and the outer product of the Jacobian is a
numerical approximation to the Hessian in the err = y - f(x, params)
or log-likelihood case.

The Hessian calculation can break down quite easily (not positive
definite because of numerical problems); the product of the Jacobian
is more robust in my experience, and cheaper. (statsmodels
GenericLikelihoodModel has covariance estimates based on the Hessian,
the Jacobian, and a sandwich of the two.)

But I never looked at the internals of minpack.
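
To make that concrete, a small sketch under my own assumptions (the
forward-difference Jacobian helper and the exponential-decay example
are illustrations, not minpack internals): for residuals
err = y - f(x, params), the product J.T J is the Gauss-Newton
approximation to the Hessian of the sum of squares, and its scaled
inverse corresponds to the cov_x that leastsq returns.

import numpy as np
from scipy import optimize

# made-up data and model: y = a * exp(b * x) + noise
rng = np.random.RandomState(1)
x = np.linspace(0, 1, 50)
y = 2.0 * np.exp(-1.5 * x) + 0.05 * rng.normal(size=x.size)

def resid(params):
    a, b = params
    return y - a * np.exp(b * x)

popt, cov_x, infodict, mesg, ier = optimize.leastsq(
    resid, x0=[1.0, -1.0], full_output=True)

# Forward-difference Jacobian of the residuals at the solution.
def approx_jacobian(f, p, eps=1e-8):
    f0 = f(p)
    jac = np.zeros((f0.size, p.size))
    for j in range(p.size):
        dp = np.zeros(p.size); dp[j] = eps
        jac[:, j] = (f(p + dp) - f0) / eps
    return jac

J = approx_jacobian(resid, np.asarray(popt, dtype=float))
# Gauss-Newton approximation to the Hessian of 0.5 * sum(resid**2) ...
cov_gn = np.linalg.inv(np.dot(J.T, J))   # ~ cov_x from leastsq
# ... scaled by the residual variance it gives the parameter
# covariance, which is what curve_fit does with leastsq's cov_x.
s_sq = (resid(popt) ** 2).sum() / (y.size - popt.size)
cov_params = cov_gn * s_sq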

Josef

> Or is it just that the current implementation of _minpack._lmdif
> (which is called by leastsq) was written such that it works this way
> (i.e. it includes a computation of cov_x in addition to x), and could
> it also be written to take a scalar func like all the other
> optimizers?
>
> Christoph