[SciPy-User] Least-squares fittings with bounds: why is scipy not up to the task?

Eric Emsellem eemselle@eso....
Fri Mar 9 02:46:37 CST 2012


Thanks David for this.

The main issue is not, in fact, to solve the problem for myself (with some 
variable substitution or similar), as I could also, e.g., interface efficient 
C/Fortran codes with python via standard wrapping (I have done this with, e.g., 
the amazing NAG library, with the help of expert programmers).

There are 3 issues here (which are closely related to each other):

- to have such a module integrated in scipy means that new python users would 
find the module by default and would not need to install more and more modules. 
This is one of the problems many people encounter. In the early days of scipy (or 
python) things had to be installed, tuned, re-installed, etc. This was fun but 
did not allow a large community to join. There are efforts to coordinate, 
homogenise and optimise all this. Scipy is one of these (and an impressive 
success). Astropy is another path specific to astronomy (my field). But for such 
complex routines, we need (I believe) things which are "simple" to use and 
already integrated. I acknowledge this is a huge effort, both to develop the 
module, and integrate it and I am not blaming anyone here (on the contrary, as 
mentioned, I am very impressed by what has been achieved!). I am just saying: I 
believe this is a "must have". People who will need such a module for their own 
goals could then use it transparently.

- if the specifics of the bounds/fixed parameters are in the user-defined 
function itself, then we lose that benefit, I think. For a new python user, this 
is nearly equivalent (although slightly better) to having to download and 
install several additional packages. You need to spend some time tuning your 
function, and cannot change the bounds or fixed parameters on the fly. In the 
long run, I would be surprised if the "non-advanced" users would really go for 
this. They would turn to, e.g., idl or whatever is convenient for them.

- When contributing to an effort like astropy (via e.g., github) and when you do 
post a new package, you would like to avoid requiring the installation of 2-3 
more packages on top of the one you are proposing (even if their installation is 
automated). At the moment, my package includes mpfit.py as a sub-module. This 
is bad practice (different packages may bundle different versions of mpfit, 
and mpfit is not optimised) but it guarantees that the person who downloads 
the package can just rely on that. In astropy, the guideline is that APART from 
matplotlib, scipy/numpy, you shouldn't have to download more if you wish to have 
a specific piece of software work on your computer. This ensures that the 
community reacts positively to this coordinating effort (which is very 
significant) and that it will attract more and more people around these 
beautiful developments, namely numpy, scipy et al.

Of course, this is just a biased opinion from a non-expert python user! :-)

cheers
Eric

On 03/09/2012 04:14 AM, David Baddeley wrote:
>  From a pure performance perspective, you're probably going to be best setting
> your bounds by variable substitution (particularly if they're only single-ended
> - x**2 is cheap) - you really don't want to have the for loops, dictionary
> lookups and conditionals that lmfit introduces for its bounds checking inside
> your objective function.
>
> I think a high level wrapper that permitted bounds, an unadulterated goal
> function, and setting which parameters to fit, but also retained much of the raw
> speed of leastsq could be accomplished with some clever on the fly code
> generation (maybe also using Sympy to automatically derive the Jacobian). Would
> make an interesting project ...
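
A minimal sketch of the variable substitution David describes, for a single-ended 
bound. Everything specific here is an assumption made for illustration and does 
not come from the thread: the exponential-decay model, the names residuals and 
fit, and the choice rate >= 0. The optimiser works on an unbounded internal 
value q[1], and the model only ever sees rate = q[1]**2, so the bound holds 
without any checks inside the objective function.

```python
import math


def residuals(q, xs, ys):
    """Residuals of the hypothetical model y = amp * exp(-rate * x), rate >= 0.

    q[0] is the amplitude (unbounded). q[1] is an unbounded internal value;
    the rate used by the model is q[1]**2, which can never be negative.
    """
    amp, rate = q[0], q[1] ** 2
    return [amp * math.exp(-rate * x) - y for x, y in zip(xs, ys)]


def fit(xs, ys, amp0=1.0, rate0=1.0):
    """Fit amp and rate with scipy.optimize.leastsq, keeping rate >= 0."""
    # Imported here so that residuals() above needs only the standard library.
    from scipy.optimize import leastsq

    # Map the starting guess into the internal (unbounded) variable.
    q0 = [amp0, math.sqrt(rate0)]
    q, ier = leastsq(residuals, q0, args=(xs, ys))
    # Map the solution back to the bounded parameter before returning it.
    return q[0], q[1] ** 2
```

The price is the one raised above: the bound is baked into the function, so 
changing or releasing it means editing residuals rather than passing a 
different argument to the fitting routine.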

