[SciPy-User] Alternatives to scipy.optimize

Matt Newville newville@cars.uchicago....
Mon Feb 27 13:50:05 CST 2012


Hi Josef,

On Mon, Feb 27, 2012 at 9:56 AM,  <josef.pktd@gmail.com> wrote:
> On Sun, Feb 26, 2012 at 10:14 AM, Matthew Newville
> <matt.newville@gmail.com> wrote:
>>
>> Hi Erik, Josef,
>>
>>
>> On Saturday, February 25, 2012 8:21:43 PM UTC-6, Erik Petigura wrote:
>>>
>>> Thanks for getting back to me!
>>> I'd like to minimize p1 and p2 together.  Let me try to describe my
>>> problem a little better:
>>>
>>> I'm trying to fit an exoplanet transit light curve.  My model is a box + a
>>> polynomial trend.
>>>
>>> https://gist.github.com/1912265
>>>
>>> The polynomial coefficients and the depth of the box are linear
>>> parameters, so I want to fit them using linear least squares.  The
>>> center and width of the transit are non-linear, so I need to fit
>>> them with an iterative approach like optimize.fmin.  Here's how I
>>> implemented it.
>>>
>>> https://gist.github.com/1912281
>>
>> I'm not sure I fully follow your model, but if I understand correctly,
>> you're looking to find optimal parameters for something like
>>   model = linear_function(p1) + nonlinear_function(p2)
>
> yes, I've read about this mostly in the context of semiparametric
> versions when the nonlinear function does not have a parametric form.
>
> http://en.wikipedia.org/wiki/Semiparametric_regression#Partially_linear_models
>
>>
>> for sets of coefficients p1 and p2, each set having a few fitting variables,
>> some of which may be related.  Is there an instability that prevents you
>> from just treating this as a single non-linear model?
>
> I think p1 and p2 shouldn't have any cross restrictions or it will get
> a bit more complicated.
> The main reason for splitting it up is computational, I think. It's
> quite common in econometrics to "concentrate out" parameters that have
> an explicit solution, so we need the nonlinear optimization only for a
> smaller parameter space.
>
>
>>
>> Another option might be to have the residual function for
>> scipy.optimize.leastsq (or lmfit) call numpy.linalg.lstsq at each
>> iteration.  I would think that would more fully explore the parameter
>> space than first fitting nonlinear_function with scipy.optimize.fmin()
>> and then passing those best-fit parameters to numpy.linalg.lstsq(),
>> but perhaps I'm not fully understanding the nature of the problem.
>
> That's what both of us did; my version is https://gist.github.com/1911544

Right, that does seem to be the preferred solution here, though it
wasn't completely clear to me that Erik meant the parameters in p1
and p2 to always be decoupled.  I may not have understood the model
(there were a lot of objects named 'p'!).  Allowing coupling between
some of the elements of p1 and p2 would seem potentially useful to me.
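
For concreteness, here's a minimal sketch of that approach (the
variable names and the quadratic trend are made up, standing in for
Erik's actual model): the residual function handed to
scipy.optimize.leastsq solves for the linear coefficients with
numpy.linalg.lstsq at each iteration, so only the box center and
width are varied iteratively.

    import numpy as np
    from scipy.optimize import leastsq

    def box(t, center, width):
        # unit-depth box: -1 inside the transit window, 0 outside
        inside = (t > center - width/2.) & (t < center + width/2.)
        return -inside.astype(float)

    def residual(p_nonlin, t, flux, deg=2):
        # p_nonlin holds only the nonlinear parameters
        center, width = p_nonlin
        # design matrix: polynomial trend columns plus the box profile;
        # the linear coefficients are "concentrated out" with lstsq
        cols = [t**k for k in range(deg + 1)] + [box(t, center, width)]
        A = np.column_stack(cols)
        coef, _, _, _ = np.linalg.lstsq(A, flux)
        return flux - A.dot(coef)

    # usage: only (center, width) are varied by leastsq
    # t, flux = ...  (the observed light curve)
    # p_best, ier = leastsq(residual, x0=(0.5, 0.1), args=(t, flux))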

>>> There is a lot of unpacking and repacking of the parameter array as
>>> it gets passed around between functions.  One option that might work
>>> would be to define functions based on a "parameter object".  This
>>> parameter object could have attributes like float/fix,
>>> linear/non-linear.  I found a more object-oriented optimization
>>> module here:
>>> http://newville.github.com/lmfit-py/
>>>
>>> However, it doesn't allow for linear fitting.
>>
>> Linear fitting could probably be added to lmfit, though I haven't looked
>> into it.   For this problem, I would pursue the idea of treating your
>> fitting problem as a single model for non-linear least squares with
>> optimize.leastsq or with lmfit.   Perhaps I'm missing something about
>> your model that makes this approach unusually challenging.
>>
>>
>>
>> Josef P wrote:
>>
>>> The easiest is to just write some helper functions to stack or unstack
>>> the parameters, or set some to fixed. In statsmodels we use this in
>>> some cases (as methods since our models are classes), also to
>>> transform parameters.
>>> Since often this affects groups of parameters, I don't know if the
>>> lmfit approach would help in this case.
>>
>> If many people who are writing their own model functions find themselves
>> writing similar helper functions to stack and unstack parameters, "the
>> easiest" here might not be "the best", and providing tools to do this
>> stacking and unstacking might be worthwhile.   Lmfit tries to do this.
>>
>>> (Personally, I like numpy arrays with masks or fancy indexing, which
>>> is easy to understand. AST manipulation scares me.)
>>
>> I don't understand how masks or fancy indexing would help here. How would
>> that work?
>
> ---- <detour warning>
> a few examples of how we currently handle parameter restrictions in statsmodels
>
> Skipper's example: transform parameters in the loglikelihood to force
> the parameter estimates of an ARMA process to produce a stationary
> (stable) solution - transforms groups of parameters at once
> https://github.com/statsmodels/statsmodels/blob/master/scikits/statsmodels/tsa/arima_model.py#L230
> https://github.com/statsmodels/statsmodels/blob/master/scikits/statsmodels/tsa/arima_model.py#L436
>
> not a clean example: fitting distributions with some frozen parameters
> https://github.com/statsmodels/statsmodels/blob/master/scikits/statsmodels/sandbox/distributions/sppatch.py#L267
> select parameters that are not frozen to use with fmin
>     x0  = np.array(x0)[np.isnan(frmask)]
> expand the parameters again to include the frozen parameters inside
> the loglikelihood function
>     theta = frmask.copy()
>     theta[np.isnan(frmask)] = thetash
>
> It's not as user-friendly as the version that got into
> scipy.stats.distributions (but it's developer-friendly, because I
> don't have to stare at it for hours to spot a bug).
>
> structural Vector Autoregression uses a similar mask pattern
> https://github.com/statsmodels/statsmodels/blob/master/scikits/statsmodels/tsa/vector_ar/svar_model.py#L89
>
> In another sandbox model I built a nested dictionary to map the model
> parameters to the reduced list that can be fed to
> scipy.optimize.fmin_xxx.
>
> But we don't have user defined nonlinear constraints yet.

Ah, thanks -- I think I understand what you're doing, at least in some
of the examples.  You're treating certain ranges of an array of
parameter values differently, with some of the values able to be
fixed or bounded.
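
To make sure I follow, a toy version of that reduce/expand mask
pattern (keeping your frmask/theta names; the rest is my guess at how
it fits together) would be:

    import numpy as np

    # frmask holds the fixed value where a parameter is frozen,
    # NaN where it is free
    frmask = np.array([np.nan, 2.0, np.nan])  # 2nd parameter frozen at 2.0
    x0 = np.array([1.0, 5.0, 3.0])            # full starting values

    # reduce: hand only the free parameters to the optimizer
    x0_free = x0[np.isnan(frmask)]            # -> [1.0, 3.0]

    def expand(theta_free):
        # expand: rebuild the full parameter vector inside the
        # objective function
        theta = frmask.copy()
        theta[np.isnan(frmask)] = theta_free
        return theta

    print(expand(x0_free))                    # [ 1.  2.  3.]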

As you no doubt understand, the approach in lmfit is completely
different, and much more flexible.

Instead of writing a function for leastsq() that takes an array of
parameter values as its first argument, the user writes a function
that takes a group of Parameters as its first argument.  At each
iteration, the Parameters hold up-to-date values, after the bounds
and constraint expressions set for each parameter have been applied.
The point is that someone can write the function once, in terms of
named, physical parameters, and then a user can change whether any of
the parameters are varied or fixed, have bounds, or are constrained
by a mathematical expression in terms of other variables, all without
changing the function that calculates the residual.
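
In code, that pattern looks something like this (a sketch -- the
exponential model and parameter names here are invented; see the
lmfit docs for the full API):

    import numpy as np
    from lmfit import Parameters, minimize

    # the model function is written once, with named parameters
    def residual(params, t, data):
        amp = params['amp'].value
        decay = params['decay'].value
        return data - amp * np.exp(-t / decay)

    params = Parameters()
    params.add('amp', value=10.0, vary=True)
    params.add('decay', value=1.0, min=0.0)   # bounded below
    # or tie it to another parameter instead:
    #   params.add('decay', expr='amp / 20')

    # t, data = ...  (the data to fit)
    # result = minimize(residual, params, args=(t, data))

All of those choices can change without touching residual().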

> ---------
>>
>> FWIW, lmfit uses python's ast module only for algebraic constraints between
>> parameters.  That is,
>>     from lmfit import Parameter
>>     Parameter(name='a', value=10, vary=True)
>>     Parameter(name='b', expr='sqrt(a) + 1')
>>
>> will compile 'sqrt(a)+1' into its AST representation and evaluate that for
>> the value of 'b' when needed.  So lmfit doesn't so much manipulate the AST
>> as interpret it.  What is manipulated is the namespace, so that 'a' is
>> interpreted as "look up the current value of Parameter 'a'" when the AST is
>> evaluated.   Again, this applies only for algebraic constraints on
>> parameters.
>
> It's a bit similar to a formula framework for specifying a statistical
> model that is to be estimated (there was a lengthy discussion on the
> statsmodels list).

Sorry, I don't follow that discussion list (a little outside my
field), and wasn't aware of a formula framework in statsmodels.  How
does that compare?

> I see the advantages, but I haven't spent the weeks of time needed to
> figure out the machinery behind it (especially given all the other
> statistics and econometrics that are still missing, and since I only
> want to have to worry about how numpy, scipy and statsmodels behave;
> and I like Zen of Python #2).

Fair enough.

The code in lmfit/asteval.py and lmfit/astutils.py is < 1000 lines, is
BSD-licensed, and imports only from:
   ast, math, (numpy), os, re, sys, __future__

That is, the import of numpy is tried, and symbols from numpy will be
used if available.   Python 2.6+ is required, as the ast module
changed quite a bit between 2.5 and 2.6.  The imports from __future__
are of division and print_function, for Python 3 compatibility.
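
To give a flavor of what interpreting the AST involves, here is a toy
evaluator (it handles only a handful of node types; asteval covers
far more):

    import ast, math

    OPS = {ast.Add: lambda a, b: a + b,
           ast.Sub: lambda a, b: a - b,
           ast.Mult: lambda a, b: a * b,
           ast.Div: lambda a, b: a / b}

    def interpret(node, names):
        # walk the AST, looking up names in a swappable namespace
        if isinstance(node, ast.Expression):
            return interpret(node.body, names)
        if isinstance(node, ast.BinOp):
            return OPS[type(node.op)](interpret(node.left, names),
                                      interpret(node.right, names))
        if isinstance(node, ast.Num):   # literal (ast.Constant in 3.8+)
            return node.n
        if isinstance(node, ast.Name):  # look up current parameter value
            return names[node.id]
        if isinstance(node, ast.Call):  # e.g. sqrt(a)
            args = [interpret(a, names) for a in node.args]
            return names[node.func.id](*args)
        raise ValueError("unsupported expression")

    tree = ast.parse('sqrt(a) + 1', mode='eval')  # parse once
    names = {'a': 10.0, 'sqrt': math.sqrt}        # 'a' -> current value
    print(interpret(tree, names))                 # 4.16227766017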

>> Having written fitting programs that support user-supplied algebraic
>> constraints between parameters in Fortran77, I find interpreting Python's
>> AST to be remarkably simple and robust.  I'm scared much more by statistical
>> modeling of economic data ;)
>
> different tastes and applications.
> I'd rather think about the next 20 statistical tests I want to code
> than about the AST or how sympy translates into numpy.

OK. That's all completely fair. I'm just saying it's not that hard,
and that it already exists if you're interested.

Cheers,

--Matt Newville

