[SciPy-User] Least-squares fittings with bounds: why is scipy not up to the task?

josef.pktd@gmai... josef.pktd@gmai...
Fri Mar 9 10:40:01 CST 2012


On Thu, Mar 8, 2012 at 7:55 PM, Matthew Newville
<matt.newville@gmail.com> wrote:
> Gael,
>
>
> On Thursday, March 8, 2012 3:07:22 PM UTC-6, Gael Varoquaux wrote:
>>
>>   I am sorry I am going to react to the provocation.
>
> And I am sorry that I am going to react to your message.  I think your
> reaction is unfair.
>
>
>>   As someone who spends a fair amount of time working on open source
>>   software I hear such remarks quite often: 'why is feature foo not
>>   implemented in package bar?'. I am finding it harder and harder not to
>>   react negatively to these emails. Now I cannot consider myself as a
>>   contributor to scipy, and thus I can claim that I am not taking your
>>   comment personally.
>
> Where I work (a large scientific user facility), there are lots of
> scientists in what I'll presume is Eric's position -- able and willing to
> work well with scientific programming tools, but unable to devote the extra
> time needed to develop core functionality or maintain much work outside of
> their own area of interest.  There are a great many scientists interested in
> learning and using python.  Several people there *are* writing scientific
> libraries with python.  Similarly in the fields I work in, python is widely
> accepted as an important ecosystem.
>
>
>>   Why isn't scipy up to the task? Well, the answer is quite simple:
>>   because it's developed by volunteers that do it in their spare time,
>>   late at night too often, or companies that put some of their benefits in
>>   open source rather than in locking down a market. 90% of the time the
>>   reason the feature isn't as good as you would want it is because of lack
>>   of time.
>>
>>   I personally find that suggesting that somebody else should put more of
>>   the time and money they are already giving away in improving a feature
>>   that you need is almost insulting.
>
> Well, in some sense, Eric's message is an expression of interest.... Perhaps
> you would prefer that nobody outside the core group of developers or mailing
> list subscribers asked for any new features or clarification of existing
> features.
>
>
>>   I am aware that people do not realize how small the group of people that
>>   develop and maintain their toys is. Borrowing from Fernando Perez's talk
>>   at Euroscipy (http://www.euroscipy.org/file/6459?vid=download slide 80),
>>   the number of people that do 90% of the grunt work to get the core
>>   scientific Python ecosystem going is around two handfuls.
>
> Well, Fernando's slides indicate there is a small group that dominates
> commits to the projects, then explain, at least partially, why that is.
> It is *NOT* because scientists expect this work to be done for them by
> volunteers who should just work harder.
>
> There are very good reasons for people to not be involved.  The work is
> rarely funded, is generally a distraction from funded work, and hardly ever
> "counts" as scientific work.  That's all on top of being a scientist, not a
> programmer.  Now, if you'll allow me, I myself am one of the "lucky"
> scientific software developers, well-recognized in my own small community
> for open source analysis software, and also in a scientific position and in
> a group where building tools for better data collection and analysis can
> easily be interpreted as part of the job.  In fact, I spend a very
> significant amount of my time writing open source software, and work nearly
> exclusively in python.
>
> So, just as an example of what happens when someone might "contribute",
> I wrote some code (lmfit-py) that could go into scipy and posted it to this
> list several months ago.  Many people have expressed interest in this
> module, and it has been discussed on this list a few times in the past few
> months.  Though lmfit-py is older than Fernando's slides (it was inspired
> after being asked several times "Is there something like IDL's mpfit, only
> faster and in python?"), it actually follows his directions of "get
> involved" quite closely: it is BSD, at github, with decent documentation,
> and does not depend on packages other than scipy and numpy.   Though it's
> been discussed on this list recently, two responses from frequent
> mailing-list responders (you, Paul V) were more along the lines of "yes,
> that could be done, in principle, if someone were up to doing the work"
> instead of "perhaps package xxx would work for you".
>
> At no point has anyone from the scipy team expressed an interest in putting
> this into scipy.  OK, perhaps lmfit-py is not high enough quality.  I can
> accept that.  My point is that there *is* a contribution but one that would
> not show up on Fernando's graph as a lengthening of "the tail of
> contributors". There ARE a few developers out there who are interested in
> making contributions, and the scipy team is not doing everything it could be
> doing to either facilitate or even encourage such participation.  In fact,
> especially given your response, it would be possible to conclude that
> contributions are actually discouraged.  It's also possible to be more
> optimistic, and conclude that Fernando's statistics are accurate only for
> each project shown, but wildly underestimate the whole of the community.

I think lmfit is a good project; it can be easily installed, and you are
able to maintain and develop it.
So I don't think the need to have it in scipy is very urgent.

On the other hand, for anyone not familiar with AST manipulation it
feels to me like a possible maintenance nightmare.
That doesn't mean it is one, but code that becomes part of a community
project should be possible for others to maintain (or should come with a
maintainer).

But maybe I have just seen too much stranded and broken code in scipy
(code that remained neglected for years).

As an example of a contribution: Fisher's exact test is a pretty
important function, but it didn't quite work for several cases. I spent
several days trying to figure out how to fix it. I was not successful,
since I was not familiar with the algorithm and the numerical problems
it raised. A while later, users or the original developer found ways to
fix the corner cases. At that stage it was possible to include it in
scipy. (There were a few additional edge cases afterwards, but those
were minor fixes.)

As a positive example, Denis Laxalde became very active and is
revamping and improving large parts of the scipy.optimize code.

Josef

>
>
>>  I'd like to think that it's a problem of skill set: users that have the
>>  ability to contribute are just too rare. This is not entirely true, there
>>  are scores of skilled people on the mailing lists. You yourself mention
>>  that you are developing a package.
>
> There are many kinds of skills.  Sometimes, not insulting your customers,
> colleagues, and potential collaborators is the most important one.
>
>
>>  Sorry for the rant, but if you want things to improve, you will have more
>>  success sending in pull requests than messages on the mailing list that
>>  sound condescending to my ears.
>>
>>  I hope that I haven't overreacted too badly.
>
> Sorry, but I think you have.  I'm impressed that Eric was appreciative -- I
> know many who would not be.
>
> For myself, I find it quite discouraging that the scipy team is so insular.
> Cheers,
>
> --Matt Newville