[SciPy-User] Central File Exchange for SciPy

Nathaniel Smith njs@pobox....
Sat Oct 30 19:43:39 CDT 2010

On Sat, Oct 30, 2010 at 3:22 PM, Fernando Perez <fperez.net@gmail.com> wrote:
> Just a few comments from the sidelines... I think it would be really
> great if every snippet had an automatic version control history
> associated with it.  For me, the gist model at github is perfect in
> this regard.  Consider for example (random gist I found that had numpy
> in it):
> http://gist.github.com/364369
> This very simple page has all the code, a download button, space for
> comments, revision history and a 'fork' button.  The last two for me
> are very, very important: they plant the seed that allows a simple
> script to very easily grow into something larger.  The author has an
> easy way to make improvements and track those (with near-zero setup
> overhead), and the 'fork' button makes it easy for others to
> contribute.

gist.github.com is *really* slick, but... I'm guessing it wouldn't be
so easy to reimplement for someone who hasn't just implemented github?
And it seems to me that the sort of people who use git (i.e., people
with a substantial investment of time and mental energy in "real
programming") are already pretty well supported by existing
infrastructure. I'm not going to be working on this either, so this is
also from the peanut gallery, but... if I *were* doing this project,
my focus would be on achieving exactly two things as quickly as
possible:

1) A minimum ceremony way for your average scientific programmer to
get some useful code they wrote online. Maximum five steps (or fewer
would be better!): a) log-in, b) type some text about what the snippet
does, c) check the box saying yeah they understand what BSD means, d)
paste in the code, e) hit submit. Maybe there should be some extra
optional steps for richer metadata or whatever, but srsly, you cannot
make "understand the GPL" or "know what git is" or "fill out this
complicated form to specify tags in our obscure Trove ontology"
prerequisites for scientific programmers to contribute.
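
To make the five steps concrete, here is a minimal sketch of what a submission could boil down to on the server side. Everything in it is an assumption for illustration (the `Snippet` record, its field names, and the `submit` helper are hypothetical, not part of any existing site): the point is only that the required inputs are the login, a description, the license checkbox, and the pasted code, with tags left optional.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Snippet:
    """Hypothetical record for one uploaded snippet (names are illustrative)."""
    author: str          # a) known from the log-in
    description: str     # b) free text about what the snippet does
    accepted_bsd: bool   # c) the "I understand what BSD means" checkbox
    code: str            # d) the pasted code
    tags: list = field(default_factory=list)  # optional richer metadata
    submitted: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


def submit(author, description, accepted_bsd, code, tags=None):
    """e) hit submit: return a Snippet, or raise ValueError listing what is missing."""
    problems = []
    if not author.strip():
        problems.append("not logged in")
    if not description.strip():
        problems.append("description is empty")
    if not accepted_bsd:
        problems.append("BSD license box is not checked")
    if not code.strip():
        problems.append("no code pasted")
    if problems:
        raise ValueError("; ".join(problems))
    return Snippet(author, description, accepted_bsd, code, list(tags or []))
```

A call such as `submit("alice", "Moving average", True, "import numpy as np\n")` succeeds with no tags at all, which is the "minimum ceremony" case; anything beyond those four inputs stays optional.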

2) Solid one-stop-shopping support for scientific code. (If you do
this right, then everyone will use the site, and then it's what
they'll think of when they have something useful to upload!) That
means, a good search function for all the snippets that have been
uploaded. It also means the search function needs to know about
"proper" packages -- searching for "wavelets" should find pywt, etc.
I'm not sure if that's best done by searching pypi directly, or by
having people explicitly enter pointers to scientific software into
the database -- I'd probably do the latter because it's both quicker
to implement and would keep the search results much more focused. And
for real one-stop-shopping, searches should be able to find functions
embedded inside larger packages (so e.g. searching for "matrix
exponential" should give you a hit on scipy.linalg.expm). I guess this
means, index the documentation for at least numpy and scipy, and maybe
the docs for other packages as they get added?
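
As a rough sketch of the one-stop search described above, the toy index below puts uploaded snippets, hand-entered package pointers, and documented functions into a single keyword index. The `SearchIndex` class, the entry kinds, and the description strings are all hypothetical and written by hand for illustration; a real site would presumably pull function descriptions from the indexed numpy and scipy docs and use a proper search engine.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


class SearchIndex:
    """Toy inverted index over snippets, packages, and functions (hypothetical)."""

    def __init__(self):
        self.entries = []                 # list of (kind, name, text)
        self.postings = defaultdict(set)  # word -> ids of entries containing it

    def add(self, kind, name, text):
        entry_id = len(self.entries)
        self.entries.append((kind, name, text))
        for word in tokenize(name + " " + text):
            self.postings[word].add(entry_id)

    def search(self, query):
        """Return (kind, name) for every entry that contains all query words."""
        words = tokenize(query)
        if not words:
            return []
        hits = set.intersection(*(self.postings.get(w, set()) for w in words))
        return [self.entries[i][:2] for i in sorted(hits)]


index = SearchIndex()
# Illustrative, hand-written descriptions (assumptions, not real metadata).
index.add("snippet", "moving-average", "Moving average of a 1-D array")
index.add("package", "pywt", "Wavelets: discrete wavelet transforms in Python")
index.add("function", "scipy.linalg.expm", "Compute the matrix exponential")

print(index.search("wavelets"))            # finds the pywt package pointer
print(index.search("matrix exponential"))  # finds scipy.linalg.expm
```

Matching here is exact-word only, with no stemming or ranking, so "wavelets" hits pywt only because the hand-entered description happens to contain that word. That is one argument for curated pointers: whoever enters them can pick the words people actually search for.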

Obviously there are lots of enhancements one can imagine -- tracking
of multiple versions of the same snippet, discussions, syntax
highlighting, finding related snippets, git support, etc. etc., and
there are lots more ideas in this thread -- but I'd start by lasering
in on those two features, work hard on making the fundamentals as
useful as possible, and then build up from there.

Hope that's useful,
-- Nathaniel
