[SciPy-dev] The future of SciPy and its development infrastructure

Stéfan van der Walt stefan@sun.ac...
Mon Feb 23 10:04:36 CST 2009


*[If you only have 30 seconds to read this email, read the bold text only]*

*Dear* SciPy *developer*s

The past while has seen a rocky ride with the SciPy servers, but yesterday
Peter Wang announced that he is attending to the situation.  This, then,
seems like the perfect time to *stand back and take a look at our
infrastructure*, and whether we should continue with the current setup.

To put this conversation into context, we have to face the facts: SciPy has
a large user community relative to the number of developers.  A big library
of code, used by many scientists, is supported by a small handful of people
all over the world.  *We cannot afford a high barrier to contribution*,
and we have to lower the effort it takes for a developer to merge
contributed code.

*I'd like to propose two changes* to the status quo:

1. *Change to a distributed revision control system*, encouraging more open
collaboration.
2. *Determine guidelines for code acceptance*, in terms of unit tests,
documentation and peer review.

Allow me to motivate these changes, and then suggest practical approaches
for their implementation:

Subversion allows only a selected group of developers to change the SciPy
source code.  This does not encourage a culture of meritocracy, but worse,
has practical implications, in that users cannot merge their own patches.  I
won't discuss the advantages of distributed revision control here, but note
that it shifts responsibility from the current core developers to
contributors; *that benefits us all!*

This ties in with my second point: code review.  The current developers have
access to SVN because they are experienced programmers with knowledge of
SciPy's scientific domains of application.  We are unable to employ this
scarce resource fully, because it simply takes too long to merge a patch
from Trac, review it, *bring it up to scratch*, and commit it.  *We have to
put a system in place which allows contributors to take responsibility for
their own patches, and core developers to guide and advise during this
process.*  As it is, we have many patches waiting on Trac for up to a year
or more without any feedback; that is not acceptable.

My view on testing is simple: *untested code is probably broken code* (and I
can show examples from the past year's commit logs to corroborate this
statement).  *As for documentation, we cannot afford to be without it.*

Implementation:

Enthought generously hosts SciPy, and I hope they will continue doing so.
New software will need to be installed on the server, but we have many hands
willing to tackle that task: David Cournapeau and myself included.  Before
deploying to scipy.org, *we will configure a different server as a proof
of concept.*

1) *Distributed revision control system: David Cournapeau and I have
been test-driving Git [1] on SciPy and NumPy for a while.  It is fast, well
supported, has great branch support, and is simple to use for the average
contributor, while allowing powerful patch-carving for the more adventurous.*

2) *Ticketing back-end:* David is exploring RedMine [2], and I'd like to
take a look at InDefero [3], but *we'll do a careful analysis* of trac-git
(like FedoraHosted [4]) too.

Thank you for taking the time to deliberate on SciPy's future.  I would love
to hear your comments.

Kind regards
Stéfan

[1] http://git.or.cz/course/svn.html
[2] http://www.redmine.org/
[3] http://scipy.indefero.net/p/numpy/
[4] http://fedorahosted.org