[SciPy-dev] scipy.optimize.nonlin rewrite
Gideon Simpson
simpson@math.toronto....
Sun Nov 30 17:08:24 CST 2008
Still no args input for inputting arguments to the function F?
Sorry to complain, but the absence of this has put me off using these
routines as it would require a rewrite of much of my code.
-gideon
On Nov 29, 2008, at 2:40 PM, Pauli Virtanen wrote:
> Hi all,
>
> I spent some time rewriting the scipy.optimize.nonlin module:
>
> http://github.com/pv/scipy/tree/ticket-791/scipy/optimize/nonlin.py
>
> The following things changed:
>
> - Support tolerance-based stopping conditions (cf. ticket #791)
>
> - Improved handling of input and return values from the functions,
> so that they are now easier to use.
>
> - Don't use np.matrix at all.
>
> - Jacobian approximations factored into classes; the iteration code is
> now in only one place, so trust-region handling or other improvements
> can be added to a single place later on.
>
> - There's now a line search to the direction inverting the Jacobian
> gave. (But there's no checking that it is an actual descent direction
> for some merit function.)
>
> - Rewrote docstrings.
>
> The routines should produce the same output as previously. The tests
> are
> still not very strong, however.
>
>
> But I have now some questions:
>
> * Getting this to 0.7.0; this is a complete rewrite, so is it too
> late,
> and is it better to wait for 0.8?
>
> * Some of the algorithms in there don't appear to work too well, and
> some appear to be redundant. I'd like to clean up this a bit, leaving
> only known-good stuff in.
>
> * I'd like to remove `broyden2` as the actual Jacobian approximation
> in
> this appears to be the same as in `broyden3`, and there does not
> appear to be much difference in the work involved in the two.
>
> Ondrej, since you wrote the original code, do you think there is
> a reason to keep both?
>
> * `broyden1_modified` appears to be, in the end if you work out the
> matrix algebra, updating the inverse Jacobian in a way that
> corresponds to
>
> J := J + (y - J s / 2) s^T / ||s||^2
>
> for the Jacobian (with s = dx, y = df). Apart from the factor
> 1/2, it's Broyden's good method. [1] One can also verify that the
> updated inverse Jacobian does not satisfy the quasi-Newton condition
> s = J^{-1} y, and that `broyden1_modified` doesn't generate the same
> sequence as `broyden1`.
>
> Hence, I'd like to remove this routine, unless there's some
> literature
> that shows that the above works better than Broyden's method; Ondrej,
> do you agree?
>
> .. [1] http://en.wikipedia.org/wiki/Broyden%27s_method
> http://en.wikipedia.org/wiki/Sherman%E2%80%93Morrison_formula
>
> * Also, which articles were used as reference for the non-Quasi-Newton
> algorithms:
>
> - `broyden_modified`. This appears to be a bit exotic, and it uses
> several magic constants (`w0`, `wl`) whose meaning is not clear to
> me.
>
> A reference would be helpful here, also for the user who has to
> choose the parameters as appropriate for his/her specific problem.
>
> - `broyden_generalized`, `anderson`, `anderson2`. These appear to be
> variants of Anderson mixing, so probably we only want at most
> one of these. Also, broyden_generalized() == anderson(w0=0), am I
> correct?
>
> `anderson` and `anderson2` don't appear to function equivalently,
> and I suspect the update formula in the latter is wrong, since this
> algorithm can't solve any of the test problems. Do you have a
> reference for this?
>
> Is there a rule how `w0` should be chosen for some specific
> problem?
>
> - `excitingmixing`. A reference would be useful, to clarify the
> heuristics used.
>
> - `linearmixing`. I'm not sure that Scipy should advocate this
> method :)
>
> Linearmixing and excitingmixing also seem to require something from
> the objective function, possibly that the eigenvalues of its Jacobian
> have the opposite sign than `alpha`. For example, neither of them can
> find a solution for the equation ``I x = 0`` where ``I`` is the
> identity matrix (test function F2). So, I'm a bit tempted to remove
> also these routines, as it seems they probably are not too useful for
> general problems.
>
> * The code in there is still a bit naive implementation of the inexact
> (Quasi-)Newton method, and one could add eg. add trust-region
> handling
> or try to guarantee that the line search direction is always a
> decrease direction for a merit function. (I don't know if it's
> possible to do the latter unless one has a matrix representation of
> the Jacobian approximation.) So, I suspect there are problems for
> which eg. MINPACK code will find solutions, but for which the
> nonlin.py code fails.
>
> * One could add more algorithms suitable for large-scale problems; for
> example some limited-memory Broyden methods (eg. [1]) or Secant-
> Krylov
> methods [2].
>
> .. [1] http://www.math.leidenuniv.nl/~verduyn/publications/reports/equadiff.ps
> .. [2] D. A. Knoll and D. E. Keyes. Jacobian free Newton-Krylov
> methods.
> Journal of Computational Physics, 20(2):357–397, 2004.
>
> I have implementations for both types of algorithms that could
> possibly go in after some polishing.
>
> --
> Pauli Virtanen
>
>
> _______________________________________________
> Scipy-dev mailing list
> Scipy-dev@scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-dev
More information about the Scipy-dev
mailing list