[SciPy-User] fmin_slsqp exit mode 8

josef.pktd@gmai... josef.pktd@gmai...
Fri Sep 28 18:24:18 CDT 2012


On Fri, Sep 28, 2012 at 2:09 PM, Pauli Virtanen <pav@iki.fi> wrote:
> 27.09.2012 18:15, josef.pktd@gmail.com kirjoitti:
>> in statsmodels we have a case where fmin_slsqp ends with mode=8
>> "POSITIVE DIRECTIONAL DERIVATIVE FOR LINESEARCH"
>>
>> Does anyone know what it means and whether it's possible to get around it?
>>
>> the fortran source file doesn't have an explanation.
>
> Guessing without wading through the F77 goto spaghetti: it could mean
> that the optimizer has wound up with a search direction in which the
> function increases (or doesn't decrease fast enough). If it's a
> termination condition, it probably also means that the optimizer is not
> able to recover from this.

I had tried some randomization as new starting values, but in this
example this didn't help.

>
> Some googling seems to indicate that this depends on the scaling of the
> problem, so it may also be some sort of a precision issue (or an issue
> with wrong tolerances):
>
> http://www.mail-archive.com/nlopt-discuss@ab-initio.mit.edu/msg00208.html

Scaling might be a problem in this example.

Eigenvalues of the Hessian (second derivative) of the unpenalized
log-likelihood function:
>>> np.linalg.eigvals(poisson_l1_res._results.model.hessian(poisson_l1_res.params))
array([-16078553.93225711,  -1374997.42454279,   -299647.67457668,
         -138719.26843099,    -15800.99493306,     -1091.16078941,
          -10258.71018359,     -3800.22940286,     -7530.7029302 ,
           -6540.09128479])

Maybe it's just a bad example to use for L1 penalization.

----
I tried scaling down the objective function and gradient, and it works.
The eigenvalues of the Hessian after scaling:

>>> np.linalg.eigvals(poisson_l1_res._results.model.hessian(poisson_l1_res.params))
array([-588.82869149,  -64.89601886,  -13.81251974,   -6.90900488,
         -0.74415772,   -0.48190709,   -0.03863475,   -0.34855895,
         -0.28063095,   -0.16671642])

I can impose a high penalization factor and still get a successful
mode=0 convergence.
I'm not sure the convergence has actually improved in relative terms.
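
One rough way to look at the "relative terms" question is to compare the
spread of the eigenvalue magnitudes (largest over smallest) before and after
scaling. The sketch below only reuses the two arrays printed above; the helper
name eig_spread is made up for illustration, and the two Hessians may have
been evaluated at different parameter estimates, so treat the comparison as
indicative only.

```python
import numpy as np

# Eigenvalues copied from the two outputs above.
eig_unscaled = np.array([
    -16078553.93225711, -1374997.42454279, -299647.67457668,
    -138719.26843099, -15800.99493306, -1091.16078941,
    -10258.71018359, -3800.22940286, -7530.7029302,
    -6540.09128479,
])
eig_scaled = np.array([
    -588.82869149, -64.89601886, -13.81251974, -6.90900488,
    -0.74415772, -0.48190709, -0.03863475, -0.34855895,
    -0.28063095, -0.16671642,
])


def eig_spread(eigvals):
    """Ratio of the largest to the smallest eigenvalue magnitude."""
    mags = np.abs(eigvals)
    return mags.max() / mags.min()


spread_unscaled = eig_spread(eig_unscaled)
spread_scaled = eig_spread(eig_scaled)
print(spread_unscaled, spread_scaled)
```

Both ratios come out on the order of 10^4, so the scaling changed the
magnitude of the curvature much more than its relative spread.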


(Now I just have to figure out if we want to consistently change the
scaling of the loglikelihood, or just hack it into L1 optimization.)
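
For reference, a minimal sketch of the "hack it into the optimization" option:
wrap the objective and gradient so that both are multiplied by the same
constant before they are handed to fmin_slsqp. Everything here is an
assumption for illustration: the data are synthetic, the scale factor 1/nobs
is just one plausible choice, and the function names are made up. It is not
the statsmodels code, and there is no L1 penalty or constraint in it.

```python
import numpy as np
from scipy.optimize import fmin_slsqp

# Synthetic Poisson regression data (illustrative only).
rng = np.random.RandomState(0)
nobs = 500
exog = np.column_stack([np.ones(nobs), rng.standard_normal(nobs)])
true_params = np.array([0.5, 0.3])
endog = rng.poisson(np.exp(exog @ true_params))


def negloglike(params):
    """Poisson negative log-likelihood without the constant log(y!) term."""
    linpred = exog @ params
    return np.sum(np.exp(linpred) - endog * linpred)


def negscore(params):
    """Gradient of negloglike with respect to params."""
    return exog.T @ (np.exp(exog @ params) - endog)


# Assumed scale factor: average over observations instead of summing.
scale = 1.0 / nobs


def func_scaled(params):
    return scale * negloglike(params)


def grad_scaled(params):
    # The gradient must be scaled by the same factor as the objective.
    return scale * negscore(params)


params, fx, its, imode, smode = fmin_slsqp(
    func_scaled, np.zeros(2), fprime=grad_scaled,
    full_output=True, iprint=0)
print(imode, smode)
print(params)
```

Multiplying by a positive constant does not move the optimum, so the
estimates are unchanged; only the magnitudes that the optimizer's tolerances
and line search see are different.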

Thanks for the hint,

Josef


>
> --
> Pauli Virtanen