[SciPy-User] Alternatives to scipy.optimize

Matt Newville newville@cars.uchicago....
Mon Feb 27 13:12:11 CST 2012


Hi Nathaniel,

On Mon, Feb 27, 2012 at 10:33 AM, Nathaniel Smith <njs@pobox.com> wrote:
> On Sun, Feb 26, 2012 at 3:14 PM, Matthew Newville
> <matt.newville@gmail.com> wrote:
>>     from lmfit import Parameter
>>     Parameter(name='a', value=10, vary=True)
>>     Parameter(name='b', expr='sqrt(a) + 1')
>>
>> will compile 'sqrt(a)+1' into its AST representation and evaluate that for
>> the value of 'b' when needed.  So lmfit doesn't so much manipulate the AST
>> as interpret it.  What is  manipulated is the namespace, so that 'a' is
>> interpreted as "look up the current value of Parameter 'a'" when the AST is
>> evaluated.   Again, this applies only for algebraic constraints on
>> parameters.
>>
>> Having written fitting programs that support user-supplied algebraic
>> constraints between parameters in Fortran77, I find interpreting python's
>> AST to be remarkably simple and robust.  I'm scared much more by statistical
>> modeling of economic data ;)
>
> So you use the 'ast' module to convert Python source into a syntax
> tree, and then you wrote an interpreter for that syntax tree?

Yes (though just for the constraint expressions).  I accept
that this can be viewed as anywhere on the continuum between highly
useful and stark raving mad.  It's such a fine line between stupid
and, uh, ... clever.

> ...Wouldn't it be easier to use Python's interpreter instead of
> writing your own, i.e., just call eval() on the source code?

Probably, though if evaluating an AST scares some people, then I
think that eval() would scare the rest even more.  So the goal was
sort of a safe-ish, mathematically-oriented eval().
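One common way to sketch such a safe-ish eval() (illustrative only, not
lmfit's actual code) is to parse first, reject any node type outside a
small whitelist, and then evaluate against a restricted namespace with
no builtins:

```python
import ast
import math

# Node types permitted in a "mathematically-oriented" expression.
ALLOWED = (ast.Expression, ast.BinOp, ast.UnaryOp, ast.Call, ast.Name,
           ast.Load, ast.Constant, ast.operator, ast.unaryop)

def safe_eval(expr, names):
    """Evaluate a math expression, rejecting any non-whitelisted syntax."""
    tree = ast.parse(expr, mode='eval')
    for node in ast.walk(tree):
        if not isinstance(node, ALLOWED):
            raise ValueError("disallowed syntax: %s" % type(node).__name__)
    # No builtins are visible; only the supplied names can be referenced.
    return eval(compile(tree, '<expr>', 'eval'), {'__builtins__': {}}, names)

print(safe_eval('sqrt(a) + 1', {'sqrt': math.sqrt, 'a': 10}))
```

Anything outside the whitelist (comprehensions, attribute access, etc.)
raises before eval() ever runs, which is the rough spirit of the approach.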

> Or are you just using the AST to figure out which variables are referenced?
> (I have some code to do just that without going through the ast
> module, on the theory that it's nice to be compatible with python 2.5,
> but I'm not sure it's really worth it.)

I do figure out which variables are referenced, but do more than that.
I run ast.parse(expression) prior to the fit, and then evaluate by
walking through the resulting tree at each iteration.  When reaching
an ast.Name node, I look up the name in a pre-reserved dictionary.
Many python and numpy symbols (sqrt, pi, etc) are automatically
included in this 'namespace'.  For each fit, the names of all the
Parameters are also included prior to the fit.   That gives a
restricted language and namespace for writing constraints.  For a
simple example,

    from lmfit import Parameter
    Parameter(name='a', value=10, vary=True)
    Parameter(name='b', expr='sqrt(a) + 1')
    Parameter(name='c', expr='a + b')

the expression for 'b' is parsed more or less (you can do
ast.dump(ast.parse('sqrt(a) + 1')) for a more complete result) to:
  BinOp(op=Add(),
        left=Call(func=Name('sqrt'), args=[Name('a')]),
        right=Num(1))

that tree is simple to walk, with each node evaluated appropriately.
Evaluation of such a tree is so easy that, although it is not highly
useful for constraint expressions in a fitting problem, the
lmfit.asteval code includes support for while and for loops,
if-then-else, and try-except, as well as full slicing and attribute
lookups for both evaluation and assignment.  The ast module does the
parsing and gives a walkable tree -- amazing and beautiful, and
standard python (for python 2.6+).
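For concreteness, a stripped-down walker covering only the node types in
the dump above (a sketch, not lmfit.asteval itself) might look like:

```python
import ast
import math

def eval_node(node, namespace):
    """Recursively evaluate a small subset of Python expression nodes."""
    if isinstance(node, ast.Expression):
        return eval_node(node.body, namespace)
    if isinstance(node, ast.Constant):   # ast.Num on older pythons
        return node.value
    if isinstance(node, ast.Name):       # look up the current value
        return namespace[node.id]
    if isinstance(node, ast.BinOp):
        left = eval_node(node.left, namespace)
        right = eval_node(node.right, namespace)
        if isinstance(node.op, ast.Add):
            return left + right
        if isinstance(node.op, ast.Sub):
            return left - right
        if isinstance(node.op, ast.Mult):
            return left * right
        if isinstance(node.op, ast.Div):
            return left / right
        raise NotImplementedError(type(node.op).__name__)
    if isinstance(node, ast.Call):
        func = eval_node(node.func, namespace)
        args = [eval_node(arg, namespace) for arg in node.args]
        return func(*args)
    raise NotImplementedError(type(node).__name__)

tree = ast.parse('sqrt(a) + 1', mode='eval')
print(eval_node(tree, {'sqrt': math.sqrt, 'a': 10}))   # sqrt(10) + 1
```

Re-parsing happens once; only the cheap recursive walk runs at each
iteration of the fit.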

As mentioned above, the evaluation of these constraints happens at
each iteration of the python function to be minimized by
scipy.optimize.leastsq, or others.   Dependencies (i.e., that 'c'
depends on 'a' and 'b' and that 'b' depends on 'a') are recorded so
that 'b' above is evaluated once per fitting loop.   Admittedly,
that's probably a minor optimization, but it is in the
lmfit/minimizer.py if anyone is looking.
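That dependency bookkeeping can be sketched in a few lines (a
hypothetical illustration, not the code in lmfit/minimizer.py): collect
the names each expression references, then order evaluation so a
parameter is computed after everything it depends on:

```python
import ast

# Constraint expressions from the example above.
exprs = {'b': 'sqrt(a) + 1', 'c': 'a + b'}

# For each expression, record every name it references.
deps = {name: {n.id for n in ast.walk(ast.parse(expr))
               if isinstance(n, ast.Name)}
        for name, expr in exprs.items()}

def eval_order(deps):
    """Order constrained parameters so dependencies are evaluated first.
    Assumes no circular dependencies."""
    done, order = set(), []
    while len(order) < len(deps):
        for name, needed in deps.items():
            if name not in done and not (needed & (deps.keys() - done)):
                order.append(name)
                done.add(name)
    return order

print(deps['c'])          # {'a', 'b'}
print(eval_order(deps))   # ['b', 'c']
```

With the order in hand, each constraint expression only needs to be
evaluated once per iteration of the fitting loop.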

This approach gives a user a lot of flexibility in setting up fitting
models, and can allow the fitting function to remain relatively static
and written in terms of the "physical" model.  Of course, it is not
going to be as fast as numexpr for array calculations, but it is more
general, and faster ufuncs are not the issue at hand.

Cheers,

--Matt Newville
