[SciPy-dev] Scipy workflow (and not tools).

Anne Archibald peridot.faceted@gmail....
Wed Feb 25 13:53:09 CST 2009


2009/2/25 David Cournapeau <cournape@gmail.com>:
> On Thu, Feb 26, 2009 at 4:22 AM, Charles R Harris
> <charlesr.harris@gmail.com> wrote:
>>
>> On Wed, Feb 25, 2009 at 11:58 AM, Matthew Brett <matthew.brett@gmail.com>
>> wrote:
>>>
>>> Hi,
>>>
>>> >> Tests protect the user and the developer alike.  It is irresponsible
>>> >> to carry on the way we do.
>>> >>
>>> > No it's not.
>>>
>>> Scipy is rarely released.  David and Stefan are saying that it is very
>>> hard to release.
>>>
>>> It might be true, that continuing with the organic, 'add it if it
>>> seems good' approach, will be fine.   But it might also be true that
>>> it will make Scipy grind to a halt, as it becomes too poorly
>>> structured and tested to maintain.
>>> http://anyall.org/blog/2009/02/comparison-of-data-analysis-packages-r-matlab-scipy-excel-sas-spss-stata/
>>>
>>> rates Numpy / Scipy / matplotlib as 'immature'.  This is mainly
>>> because of Scipy, and it's fair.  If we want it to change we have to
>>> be able to release versions that have good documentation and low bug
>>> counts.
>>>
>>> The choices we make now are going to have long-lasting consequences for
>>> Scipy.
>>>
>>> I think our best guess, from what David and Stefan are saying, is that we
>>> need a change towards more structured process.  I stress the word
>>> "need".  This doesn't seem surprising to me.   I think we've got to
>>> listen to them, because they are doing the work of maintaining and
>>> releasing Scipy.
>>
>> Much of Scipy *isn't* maintained, that is why it is immature. There are
>> parts that need to be worked over and rationalized and that isn't happening.
>> You can't review code that hasn't been written. Some of that is history: the
>> initial impetus in Scipy was interfacing existing C and Fortran libraries
>> with Python and scratching itches. But that isn't the same as putting
>> together a large package with smoothly interacting parts and verified
>> results. And before that can happen we need more people working on the
>> parts.
>
> Also, if the problem is man power, adding more code which makes the
> whole package more difficult to handle does not sound like a
> future-proof path. Unless the goal of scipy is to become a bag of tricks
> which may be useful to some people, without any commitment from our
> side.
>
> Some parts of scipy are difficult to maintain because they have no
> tests and no documentation - it is not even obvious what it is
> supposed to do. I am afraid we can't have it both ways: if we want to
> increase quality, given man power, we have to reduce the amount of
> code which requires constant attention. If we want more features
> first, then, we can continue like we do now. But then we can't expect
> constant releases, which are relatively well tested.

It seems to me that one reason for the current disagreement is that
people are talking about two different things:

(1) Getting new code written from scratch and into the repository, and
(2) Getting (and keeping) the code we have working reliably.

For (1), tests and documentation are indeed a barrier (albeit in my
opinion a very low one). For (2), though, requiring tests and
documentation will drastically decrease the effort required.

Put another way: some people are arguing that not requiring tests or
documentation will get more people contributing new code. Others are
arguing that allowing code without tests or documentation into the
trunk will increase the manpower required to do basic things like make
releases.

Personally, I don't think requiring tests and documentation is a
barrier to new users. I was very hesitant about my first contribution
because I really didn't want to put in broken or embarrassingly bad
code, so the fact that I could test it systematically, confirm that it
didn't break anything else, and document it clearly made me more
confident that I was contributing something that wouldn't require me
to wear a paper bag on my head.

But let's assume that it is a barrier to contributing new code. Which
does scipy need more right now: reliability in the code it has and a
regular release cycle, or lots more new code?


Anne

