[SciPy-dev] Scipy workflow (and not tools).

Michael Abshoff michael.abshoff@googlemail....
Thu Feb 26 11:16:17 CST 2009


Stéfan van der Walt wrote:
> 2009/2/26 Jonathan Guyer <guyer@nist.gov>:

Hi,

>> If I may very unfairly summarize the debate this far:
>>
>> Stéfan, et al.: There aren't enough tests. All code must have tests!
>> Under penalty of death!
>> Travis, et al.: But then we shall have neither code *nor* tests!
>> Stéfan, et al.: Good!
>
> I agree, your summary isn't accurate: the world isn't that black and
> white.  To summarise a profound story I heard the other day:
> 
> "A wise man first dams up the river, before trying to empty the lake."
> 
> Unless we test new code, how do we make time to clean up the old code?
> 
> I certainly would not like to see SciPy stagnate because of hard-line
> policy.  My fear is that unless we do something to protect our
> scarce developer resources, we will see the same thing happening.

Well, I have been following this thread so far and have abstained from 
commenting except for one small email, but here are a couple more 
comments from the peanut gallery, knowing full well that most people 
will see things differently from the conclusion I reach at the end:

1. If code is not tested, it is broken. It might not be broken on your 
machine, but how about different endianness or pointer sizes? How about 
different compilers, BLAS backends, and so on?

2. Tests illustrate how to use the code, which makes the code more 
accessible for someone other than the author to use and to improve. 
Good coverage also lessens the risk that changes at one end of the 
codebase break things in other places. (A small sketch illustrating 
both points follows below.)
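
Since we are talking about doctests anyway, here is a minimal sketch 
of both points in plain Python (not Sage-specific; read_le_u32 is a 
made-up helper for illustration). The doctest documents the intended 
usage, and because the format string '<I' pins the byte order, the 
test passes on little- and big-endian hosts alike, whereas a native 
byte-order version would pass on x86 and fail on a big-endian PPC box:

    import struct

    def read_le_u32(buf):
        r"""Decode an unsigned 32-bit little-endian integer from buf.

        >>> read_le_u32(b'\x01\x00\x00\x00')
        1
        >>> read_le_u32(b'\x00\x00\x00\x01')  # same bytes reversed
        16777216
        """
        # '<I' = little-endian, standard-size unsigned int: explicit,
        # so the result is identical on every platform.
        return struct.unpack('<I', buf)[0]

    if __name__ == '__main__':
        import doctest
        doctest.testmod()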

I believe the two points above are more or less agreed upon by 
everybody in the discussion so far (well, probably not the "untested 
code is broken" bit); the main dividing issue is whether imposing "too 
much" or "too restrictive" a process will hurt Scipy more than it will 
help.

Tests can also be used for other things; for example, in Sage we use 
doctests to automatically look for memory leaks as well as for speed 
regressions. The speed regression code is not yet run automatically, 
but after experiencing slowdowns over and over again, finishing the 
needed infrastructure bits is a high priority.
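
The actual harness is not shown here, but the idea behind the speed 
regression checks can be sketched in a few lines of plain Python (the 
names BASELINES and check_speed are invented for illustration): time 
each tracked operation, compare against a recorded baseline, and fail 
loudly when it slows down.

    import time

    # Hypothetical baseline table: wall-clock seconds recorded for
    # each tracked operation on a reference machine.
    BASELINES = {'charpoly_200x200': 0.80}

    def check_speed(name, func, slack=1.5):
        """Run func and fail if it takes more than slack times the
        recorded baseline, i.e. flag a likely speed regression."""
        start = time.time()
        func()
        elapsed = time.time() - start
        budget = BASELINES[name] * slack
        if elapsed > budget:
            raise AssertionError('%s took %.2fs (budget %.2fs): '
                                 'speed regression?'
                                 % (name, elapsed, budget))

The slack factor is there so that ordinary machine-to-machine noise 
does not trip the check; only a real slowdown should fail.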


Now the rather controversial point: in Sage we do not want your 
contribution if it does not have 100% coverage. There is code sitting 
in trac where the author has been told to get coverage up to standard; 
otherwise the code will not be merged. Obviously, if the code is good 
and just missing some coverage, the reviewer is often willing to add 
the missing bits to bring it up to standard. But we will not merge 
code that isn't fully tested or that has known failures.

The decision to mandate review and coverage was made after a fairly 
short in-person discussion at a Sage Days in September 2007, and it 
took a while to convince the last person that this was a good idea. We 
lost one developer who was not willing to write doctests for his code, 
and all I can say is "good riddance". He was not working on a 
particularly important piece of code, and I believe other external 
factors also contributed to his dropping out of the Sage project.

Losing a person might seem bad, and it certainly is not a good thing. 
But while the policy of mandatory coverage, and later of mandatory 
review, caused pain and required adjustments to the way we did things, 
it has shown great long-term benefits. Review does not catch every 
issue, but it sure as hell catches a lot before the code even goes in. 
Once things have been merged they are in, and unless someone takes the 
time right then and there to review them, the issues will likely just 
fade into the background.

If you look at the areas of Sage with low coverage, you will find code 
that is in need of a lot of attention. That code was merged into Sage 
back in the day, before the review and coverage requirements, and it 
clearly shows. The argument some people make, that one can go back and 
review code once it is in, certainly holds; the problem is that there 
are usually more pressing issues to deal with right now than problems 
one could deal with later [the code is already in the tree :)], and so 
very little happens to that code unless it is truly bad. The comment 
about "damming the river" fits very well here.

We can and do release Sage versions, each generally better than the 
previous release, at a much higher frequency than just about any other 
project out there. This is largely possible because of coverage and 
review. At one point, some critical bugs popped up at a rather 
important installation, and William and I went from the decision to 
make a Sage release, to having a handful of bugs fixed and verified to 
work, to the final release tarball in less than 12 hours.

And one more thing about the idea that trivial and obvious fixes need 
no review: those are sometimes the worst and hardest bugs to fix in 
the right way. Too often, people have attempted to "fix" things in 
Sage without understanding the underlying implementation details. We 
have been burned by "trivial" one-line fixes often enough to develop 
deep skepticism that anything can be verified as fixed without a test. 
One example is one of the bugs that originally led to the decision to 
mandate the coverage policy: we had a bug in an external library where 
the characteristic polynomial of a matrix over the integers came out 
wrong.

  * First try: Fix the issue in the library and update it in Sage

Oops, we forgot the fix, and *nobody* noticed, since no doctest 
checking for the bug had been added to Sage. Let's redo it.

  * Second try: Fix the issue in the library, update it in Sage, and 
add a doctest

Testing the build revealed that everything worked fine. A couple of 
days later we cut an rc release, but behold: doctesting revealed that 
on a big-endian PPC box the fix in the upstream library was 
*completely* broken.

So the third try was the charm. Without mandated tests for fixes we 
would certainly have caught this bug at some point, but it would 
likely have been much more work to determine which change caused the 
problem, since the side effects of broken code in a case like this can 
surface in far-away places in the codebase.
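
For what it is worth, the shape of the test that finally nailed the 
bug down is simple. The actual matrix, library, and values from the 
incident are not reproduced in this message, so the sketch below uses 
an arbitrary 2x2 example via NumPy: pin the expected coefficients 
exactly, and a silently wrong characteristic polynomial on any 
platform fails the suite instead of shipping.

    import numpy as np

    def test_charpoly_regression():
        # The characteristic polynomial of [[1, 2], [3, 4]] is
        # x^2 - 5x - 2; the coefficients must come out the same on
        # every platform we test, big-endian PPC included.
        coeffs = np.poly(np.array([[1.0, 2.0], [3.0, 4.0]]))
        assert np.allclose(coeffs, [1.0, -5.0, -2.0]), coeffs

    test_charpoly_regression()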

I am not asking anyone to agree with me; I just wanted to point out 
what the Sage project has been doing and share some of the experience 
I have collected as the Sage release manager. There is no 
one-size-fits-all for projects, and project-specific culture and 
customs are hard to change, but it can be done. I believe the Sage 
project has greatly benefited from the policy changes described above. 
Now you need to come to your own conclusions :)

> Regards
> Stéfan

Cheers,

Michael



