[SciPy-User] Least-squares fittings with bounds: why is scipy not up to the task?
Fri Mar 9 10:40:01 CST 2012
On Thu, Mar 8, 2012 at 7:55 PM, Matthew Newville wrote:
> On Thursday, March 8, 2012 3:07:22 PM UTC-6, Gael Varoquaux wrote:
>> I am sorry I am going to react to the provocation.
> And I am sorry that I am going to react to your message. I think your
> reaction is unfair.
>> As someone who spends a fair amount of time working on open source
>> software I hear such remarks quite often: 'why is feature foo not
>> implemented in package bar?'. I am finding it harder and harder not to
>> react negatively to these emails. Now I cannot consider myself as a
>> contributor to scipy, and thus I can claim that I am not taking your
>> comment personally.
> Where I work (a large scientific user facility), there are lots of
> scientists in what I'll presume is Eric's position -- able and willing to
> work well with scientific programming tools, but unable to devote the extra
> time needed to develop core functionality or maintain much work outside of
> their own area of interest. There are a great many scientists interested in
> learning and using python. Several people there *are* writing scientific
> libraries with python. Similarly in the fields I work in, python is widely
> accepted as an important ecosystem.
>> Why is scipy not up to the task? Well, the answer is quite simple:
>> because it's developed by volunteers that do it in their spare time,
>> at night too often, or companies that put some of their profits into open
>> source rather than into locking down a market. 90% of the time the reason the
>> feature isn't as good as you would want it is because of lack of time.
>> I personally find that suggesting that somebody else should put more of
>> the time and money they are already giving away in improving a feature
>> that you need is almost insulting.
> Well, in some sense, Eric's message is an expression of interest.... Perhaps
> you would prefer that nobody outside the core group of developers or mailing
> list subscribers asked for any new features or clarification of existing
>> I am aware that people do not realize how small the group of people that
>> develop and maintain their toys is. Borrowing from Fernando Perez's talk
>> at Euroscipy (http://www.euroscipy.org/file/6459?vid=download slide 80),
>> the number of people that do 90% of the grunt work to get the core
>> scientific Python ecosystem going is around two handfuls.
> Well, Fernando's slides indicate there is a small group that dominates
> commits to the projects, then explain, at least partially, why that is.
> It is *NOT* because scientists expect this work to be done for them by
> volunteers who should just work harder.
> There are very good reasons for people to not be involved. The work is
> rarely funded, is generally a distraction from funded work, and hardly ever
> "counts" as scientific work. That's all on top of being a scientist, not a
> programmer. Now, if you'll allow me, I myself am one of the "lucky"
> scientific software developers, well-recognized in my own small community
> for open source analysis software, and also in a scientific position and in
> a group where building tools for better data collection and analysis can
> easily be interpreted as part of the job. In fact, I spend a very
> significant amount of my time writing open source software, and work nearly
> exclusively in python.
> So, just as an example of what happens when someone might "contribute",
> I wrote some code (lmfit-py) that could go into scipy and posted it to this
> list several months ago. Many people have expressed interest in this
> module, and it has been discussed on this list a few times in the past few
> months. Though lmfit-py is older than Fernando's slides (it was inspired
> after being asked several times "Is there something like IDL's mpfit, only
> faster and in python?"), it actually follows his directions of "get
> involved" quite closely: it is BSD, at github, with decent documentation,
> and does not depend on packages other than scipy and numpy. Though it's
> been discussed on this list recently, two responses from frequent
> mailing-list responders (you, Paul V) were more along the lines of "yes,
> that could be done, in principle, if someone were up to doing the work"
> instead of "perhaps package xxx would work for you".
> At no point has anyone from the scipy team expressed an interest in putting
> this into scipy. OK, perhaps lmfit-py is not high enough quality. I can
> accept that. My point is that there *is* a contribution but one that would
> not show up on Fernando's graph as a lengthening of "the tail of
> contributors". There ARE a few developers out there who are interested in
> making contributions, and the scipy team is not doing everything it could be
> doing to either facilitate or even encourage such participation. In fact,
> especially given your response, it would be possible to conclude that
> contributions are actually discouraged. It's also possible to be more
> optimistic, and conclude that Fernando's statistics are accurate only for
> each project shown, but wildly underestimate the whole of the community.
I think lmfit is a good project, and it can be easily installed. You are
able to maintain and develop it.
So I don't think the need to have it in scipy is very urgent.
On the other hand, for anyone not familiar with AST manipulation it
feels to me like a possible maintenance nightmare.
It doesn't mean it is, but as part of a community project it should be
possible to maintain (or come with a maintainer).
But maybe I have just seen too much stranded and broken code in scipy
(that remained neglected for years).
As an example of a contribution: Fisher's exact test, a pretty
important function, but it didn't quite work for several cases. I spent
several days trying to figure out how to fix it. I was not successful
since I was not familiar with the algorithm and the numerical problems
it raised. A while later users or the original developer found ways to
fix the corner cases. At that stage it was possible to include it in
scipy. (There were a few additional edge cases afterwards, but those
were minor fixes.)
As a positive example, Denis Laxalde became very active and is
revamping and improving large parts of the scipy.optimize code.
>> I'd like to think that it's a problem of skill set: users that have the
>> ability to contribute are just too rare. This is not entirely true, there
>> are scores of skilled people on the mailing lists. You yourself mention
>> that you are developing a package.
> There are many kinds of skills. Sometimes, not insulting your customers,
> colleagues, and potential collaborators is the most important one.
>> Sorry for the rant, but if you want things to improve, you will have more
>> success sending in pull requests than messages on mailing lists that
>> sound condescending to my ears.
>> I hope that I haven't overreacted too badly.
> Sorry, but I think you have. I'm impressed that Eric was appreciative -- I
> know many who would not be.
> For myself, I find it quite discouraging that the scipy team is so insular.
> --Matt Newville