[SciPy-dev] optimizers module

Matthieu Brucher matthieu.brucher@gmail....
Mon Aug 20 11:34:33 CDT 2007


Hi again ;)

> I have already committed
> ./solvers/optimizers/line_search/qubic_interpolation.py
> tests/test_qubic_interpolation.py



qubic should be cubic, no?


> the problems:
> 1. I have implemented the stop tolerance on x as self.minStepSize.
> However, isn't it more correct to check |x_prev - x_new| against the
> user-supplied xtol, rather than |alpha_prev - alpha_new|? If the routine
> is called from a multi-dimensional NL problem with a known, user-provided
> xtol, I think it's more convenient and more correct to use
> |x_prev - x_new| instead of |alpha_prev - alpha_new| as the stop
> criterion.



The basic cubic interpolation works on alpha. If you want to implement
another one based on x, no problem. I think that, as a first step, we should
add the standard algorithms that are documented and described; once that is
done, we can explore.
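
Note that the two criteria are directly related anyway: along a fixed search
direction, x_new - x_prev = (alpha_new - alpha_prev) * direction, so a
tolerance on x is just a tolerance on alpha scaled by the norm of the
direction. A tiny illustration (made-up numbers):

import numpy as np

# Along a fixed direction,
#   |x_new - x_prev| = |alpha_new - alpha_prev| * ||direction||,
# so the two stop criteria differ only by the norm of the direction.
direction = np.array([3., 4.])          # norm is 5
alpha_prev, alpha_new = 0.10, 0.12
x_step = abs(alpha_new - alpha_prev) * np.linalg.norm(direction)
print(x_step)    # 0.1: an xtol of 0.1 maps to an alpha tol of 0.02 here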


> 2. (This one is primarily for Matthieu): where should gradtol be taken
> from? It's the main stop criterion, according to the algorithm.
> Currently I just set it to 1e-6.



It should be taken from the constructor (see damped_line_search.py, for
instance).
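
Something like this (a minimal sketch, not damped_line_search.py itself; it
assumes the function object is callable and exposes a gradient() method, and
the class name and state keys are only illustrative):

import numpy as np

class ToyLineSearch(object):
    """Sketch only: the stopping parameters live in the constructor."""

    def __init__(self, gradtol=1e-6, min_step=1e-8, maxiter=100):
        self.gradtol = gradtol    # main stop criterion on the gradient
        self.min_step = min_step  # smallest step allowed along the direction
        self.maxiter = maxiter    # guards against CPU-hanging loops

    def __call__(self, origin, function, state):
        direction = state['direction']
        # if the slope along the direction is already flat, stop here
        if abs(np.dot(function.gradient(origin), direction)) < self.gradtol:
            return origin
        alpha, f0 = 1., function(origin)
        for _ in range(self.maxiter):
            candidate = origin + alpha * direction
            if function(candidate) < f0:   # plain backtracking step
                return candidate
            alpha *= .5
            if alpha < self.min_step:
                break
        return origin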


> 3. Don't you think that maxIter and/or maxFunEvals rules should be added?
> (I ask because I didn't see them in Matthieu's quadratic_interpolation
> solver.)



That is a good question; it also came up in our discussion of the strong
Wolfe-Powell rules, at least for maxIter.


> It will make the algorithms more robust against CPU-hanging loops caused
> by bugs of our own and/or by difficult functions. I had implemented them,
> but since Matthieu didn't have them in quadratic_interpolation, I just
> commented out those stop criteria (all I can do is set my own defaults,
> like 400 or 1000 (as well as gradtol = 1e-6), but since Matthieu's
> "state" variable (afaik) doesn't contain them, I can't take them as
> parameters).
> So should they exist or not?



If you want to use them, you should put them in the __init__ method as well.

The state could be populated with everything, but that would mean very
cumbersome initializations. On the one hand, you could create each module
with no parameters and pass all of them to the optimizer, but that could
mean a very long, unreadable line. On the other hand, you could create the
optimizer first and then create every module with the optimizer as a
parameter, which is not intuitive enough.
This is where the limit between the separation principle and object
orientation is fuzzy.
So the state dictionary is only responsible for what is specifically
connected to the function: either the parameters, or the different
evaluations (hessian, gradient, direction and so on). That's why you "can't"
put gradtol in it (for instance).
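
In other words, the split looks like this (the names are purely
illustrative):

import numpy as np

x = np.zeros(2)                    # current parameters
grad = np.array([1., -1.])         # latest gradient evaluation

# The state dictionary holds only function-related quantities...
state = {'parameters': x, 'gradient': grad, 'direction': -grad}

# ...while algorithmic knobs stay with each module, e.g.:
# search = ToyLineSearch(gradtol=1e-6, maxiter=400)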

I saw that you test for the presence of the gradient method; you should not.
If people want to use this line search, they _must_ provide a gradient. If
they can't provide an analytical gradient, they can provide a numerical one
by using helpers.ForwardFiniteDifferenceDerivatives. This is questionable, I
know, but the simpler the algorithms, the simpler their use, their reading
and their debugging (that way, you can get rid of the f_and_df function as
well, or at least of the test).
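
For example (the wrapper call below is from memory; the exact import path
and signature of helpers.ForwardFiniteDifferenceDerivatives may differ, so
check the tree):

import numpy as np
from solvers.optimizers import helpers   # import path assumed

def rosenbrock(x):
    x = np.asarray(x)
    return (1 - x[0])**2 + 100 * (x[1] - x[0]**2)**2

# Wrapping the plain function gives it a numerical .gradient(), so the
# line search can require a gradient unconditionally.
function = helpers.ForwardFiniteDifferenceDerivatives(rosenbrock)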

Matthieu