[SciPy-User] Least-squares fittings with bounds: why is scipy not up to the task?

Ralf Gommers ralf.gommers@googlemail....
Fri Mar 9 15:49:29 CST 2012


On Fri, Mar 9, 2012 at 10:46 PM, Charles R Harris <charlesr.harris@gmail.com> wrote:

>
>
> On Fri, Mar 9, 2012 at 2:36 PM, Ralf Gommers <ralf.gommers@googlemail.com> wrote:
>
>>
>>
>> On Fri, Mar 9, 2012 at 1:55 AM, Matthew Newville <matt.newville@gmail.com> wrote:
>>
>>> Gael,
>>>
>>>
>>> On Thursday, March 8, 2012 3:07:22 PM UTC-6, Gael Varoquaux wrote:
>>> >
>>> >   I am sorry I am going to react to the provocation.
>>>
>>> And I am sorry that I am going to react to your message.  I think your
>>> reaction is unfair.
>>>
>>>
>>> >   As someone who spends a fair amount of time working on open source
>>> >   software I hear such remarks quite often: 'why is feature foo not
>>> >   implemented in package bar?'. I am finding it harder and harder not to
>>> >   react negatively to these emails. Now I cannot consider myself as a
>>> >   contributor to scipy, and thus I can claim that I am not taking your
>>> >   comment personally.
>>>
>>> Where I work (a large scientific user facility), there are lots of
>>> scientists in what I'll presume is Eric's position -- able and willing to
>>> work well with scientific programming tools, but unable to devote the extra
>>> time needed to develop core functionality or maintain much work outside of
>>> their own area of interest.  There are a great many scientists interested
>>> in learning and using python.  Several people there *are* writing
>>> scientific libraries with python.  Similarly in the fields I work in,
>>> python is widely accepted as an important ecosystem.
>>>
>>>
>>> >   Why is scipy not up to the task? Well, the answer is quite simple:
>>> >   because it's developed by volunteers that do it in their spare time,
>>> >   late at night too often, or companies that put some of their benefits
>>> >   in open source rather than in locking down a market. 90% of the time
>>> >   the reason the feature isn't as good as you would want it is because
>>> >   of lack of time.
>>> >
>>> >   I personally find that suggesting that somebody else should put more
>>> >   of the time and money they are already giving away in improving a
>>> >   feature that you need is almost insulting.
>>>
>>> Well, in some sense, Eric's message is an expression of interest....
>>> Perhaps you would prefer that nobody outside the core group of developers
>>> or mailing list subscribers asked for any new features or clarification of
>>> existing features.
>>>
>>>
>>> >   I am aware that people do not realize how small the group of people
>>> >   that develop and maintain their toys is. Borrowing from Fernando
>>> >   Perez's talk at Euroscipy
>>> >   (http://www.euroscipy.org/file/6459?vid=download slide 80), the
>>> >   number of people that do 90% of the grunt work to get the core
>>> >   scientific Python ecosystem going is around two handfuls.
>>>
>>> Well, Fernando's slides indicate there is a small group that dominates
>>> commits to the projects, and then explain, at least partially, why that
>>> is.  It is *NOT* because scientists expect this work to be done for them by
>>> volunteers who should just work harder.
>>>
>>> There are very good reasons for people to not be involved.  The work is
>>> rarely funded, is generally a distraction from funded work, and hardly ever
>>> "counts" as scientific work.  That's all on top of being a scientist, not a
>>> programmer.  Now, if you'll allow me, I myself am one of the "lucky"
>>> scientific software developers, well-recognized in my own small community
>>> for open source analysis software, and also in a scientific position and in
>>> a group where building tools for better data collection and analysis can
>>> easily be interpreted as part of the job.  In fact, I spend a very
>>> significant amount of my time writing open source software, and work nearly
>>> exclusively in python.
>>>
>>> So, just as an example of what happens when someone might
>>> "contribute",  I wrote some code (lmfit-py) that could go into scipy and
>>> posted it to this list several months ago.  Many people have expressed
>>> interest in this module, and it has been discussed on this list a few times
>>> in the past few months.  Though lmfit-py is older than Fernando's slides
>>> (it was inspired after being asked several times "Is there something like
>>> IDL's mpfit, only faster and in python?"), it actually follows his
>>> directions of "get involved" quite closely: it is BSD, at github, with
>>> decent documentation, and does not depend on packages other than scipy and
>>> numpy.   Though it's been discussed on this list recently, two responses
>>> from frequent mailing-list responders (you, Paul V) were more along the
>>> lines of  "yes, that could be done, in principle, if someone were up to
>>> doing the work" instead of "perhaps package xxx would work for you".
>>>
>>> At no point has anyone from the scipy team expressed an interest in
>>> putting this into scipy.  OK, perhaps lmfit-py is not high enough quality.
>>> I can accept that.
>>
>>
>> I don't think anyone has doubts about the quality of lmfit. On the
>> contrary, I've asked you to list it on http://scipy.org/Topical_Software
>> (which you did) because I thought it looked interesting, and have directed
>> some users towards your package. The documentation is excellent, certainly
>> better than that of many parts of scipy. The worry with your code is that
>> the maintenance burden may be relatively high, simply because very few
>> developers are familiar with AST. The same for merging it in scipy - one of
>> the core developers will have to invest a significant amount of time
>> wrapping his head around your work.
>>
>> The ideal scenario from my point of view would be this:
>> - lmfit keeps being maintained by you as a separate package for a while
>> (say six months to a year)
>> - it gains more users, who can discover potential flaws and provide
>> feedback. The API can still be changed if necessary.
>> - once it's stabilized a bit more, you propose it again (and more
>> explicitly) for inclusion in scipy
>> - one of the developers does a thorough review and merges it into
>> scipy.optimize
>> - you get commit rights and maintain the code within scipy
>> - bonus points: if you would be interested in improving and reviewing PRs
>> for related code in optimize.
>>
>>
>> Scipy is a very good place to add functionality that's of use in many
>> different fields of science and engineering, but it needs many more active
>> developers. I think this thread is another reminder of that. Some of the
>> criticism in this thread about how hard it is to contribute is certainly
>> justified. I've had the plan for a while (since Fernando's EuroScipy talk
>> actually) to write a more accessible "how to contribute" document than the
>> one Pauli linked to. Besides the mechanics (git, Trac, etc.) it should at
>> least provide some guidance on what belongs in scipy vs. in a scikit, how
>> to get help, how to move a contribution that doesn't get a response
>> forward, etc. I'll try to get a first draft ready within the next week or
>> so.
>>
>>
> I wonder if it would be useful to put a reference to lmfit in the leastsq
> documentation? I know that would need to be temporary and that referencing
> something outside scipy is unusual, but it might help increase the number
> of users and help it on its way.
>

Fine with me. I actually think we can do this more often, both for packages
that may be included in scipy later and for packages like
scikits.image/statsmodels/learn.

Ralf