[Numpy-discussion] ANN: NumPy/SciPy Documentation Marathon 2008

Joe Harrington jh@physics.ucf....
Sat May 17 00:45:32 CDT 2008


As we all know, the state of the numpy and scipy reference
documentation (aka the docstrings) is best described as "incomplete".
Most functions have docstrings shorter than 5 lines, whereas our
competitors IDL and Matlab usually have a concise and well-written
page or two per function.  The (wonderful) categorized list of
functions is very new and isn't included in the package yet.  There
isn't even a "Getting Started"-type of document you can hand a new
user so they can dive right in.  Documentation tools are limited to
plain-text paginators, while our competition enjoys HTML-based
documents with formulae, images, search capability, and cross linking.

Tales of woe abound.  A university class switched to Numpy and got
hopelessly bogged down because students couldn't find out how to call
the functions.  A developer looked something up while giving a
presentation and the words "Blah, Blah, Blah" stared down at the
audience in response.

To head off another pedagogical meltdown, the University of Central
Florida has hired Stefan van der Walt full time to coordinate a
community documentation effort to write reference documentation and
tools.  The project starts now and continues through the summer.  The
goals are to:

1. Produce complete docstrings for all numpy functions and as much of
   scipy as possible,

2. Produce an 8-15 page Getting Started tutorial that is not

3. Write reference sections on topics in numpy, such as slicing and
   the principles of using the modules,

4. Complete a first edition, in both PDF and HTML, of a NumPy
   Reference Manual, and

5. Check everything into the sources by 1 August 2008 so that the
   Packaging Team can cut a release and have it available in time for
   Fall 2008 classes.

Even Stefan could not document the hundreds of functions that need it
by himself, and in any case such a large contribution requires
community review.  To make it easy for everyone to contribute, Pauli
Virtanen and Emmanuelle Guillart have provided a wiki system for
editing reference documentation.  The idea was developed by Fernando
Perez, Stefan, and Gael Varoquaux.  We encourage community members to
write, review, and proofread reference pages on this wiki.  Stefan
will check updates into the sources roughly weekly.  Near the end of
the project, we will put these wiki pages through a vetting process
and then check them into the sources a final time for a release,
which we hope will occur in early August.

Meanwhile, Perry Greenfield has taken the lead on task 3, writing
reference docs for things that currently don't have docstrings, such
as basic concepts like slicing.

We have proposed two small extensions to the current docstring format,
for images (to be used sparingly) and indexing.  These appear in
updated versions of the doc standard, which are linked from the wiki
frontpage.  Please take a look and comment on these if you like.  All
docstrings will remain readable in plain text, but we are now
generating a full reference guide in PDF and HTML (you guessed it,
linked from the wiki).  These are searchable formats.

There are several ways you can help:

1. Write some docstrings on the wiki!  Many people can do this, many
more than can write code for the package itself.  However, you must
have a solid knowledge of numpy, the function group, and the function
you are documenting.
You should be familiar with the concept of a reference page and write
in that concise style.  We'll do tutorial docs in another project at a
later date.  See the instructions on the wiki for guidelines and

2. Review others' docstrings and leave comments on their wiki pages.

3. Proofread docstrings.  Make sure they are correct, complete, and
concise.  Fix grammar.

4. Write examples ("doctests").  Even if you are not a top-notch
English writer, you can help by producing a code snippet of a few
lines that demonstrates a function.  It is fine for them to go into
the docstring templates before the actual text.

5. Write a new help function that optionally produces ASCII or points
the user's PDF or HTML reader to the right page (either local or

6. If you are in a position to hire someone, such as a knowledgeable
student or short-term consultant, hire them to work on the tasks above
for the summer.  We can provide supervision to them or guidance to you
if you like.
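
For anyone unsure what an example ("doctest") from item 4 looks like,
here is a minimal sketch.  The function clip_to_range is hypothetical
and written in plain Python purely for illustration; it is not part of
numpy or scipy.  The section layout (Parameters, Returns, Examples) is
an assumption about the doc standard, so check the version linked from
the wiki before copying it.

```python
import doctest


def clip_to_range(values, low, high):
    """
    Limit each value in a sequence to the interval [low, high].

    Hypothetical helper, used only to illustrate the docstring layout.

    Parameters
    ----------
    values : sequence of numbers
        Input data.
    low, high : number
        Lower and upper bounds of the interval.

    Returns
    -------
    clipped : list
        A new list in which every element lies between low and high.

    Examples
    --------
    >>> clip_to_range([1, 5, 12], 2, 10)
    [2, 5, 10]
    >>> clip_to_range([], 0, 1)
    []
    """
    # Raise values below `low`, then lower values above `high`.
    return [min(max(v, low), high) for v in values]


if __name__ == "__main__":
    # Run every ">>>" example in this module and report any mismatch.
    print(doctest.testmod())
```

The lines starting with ">>>" are the doctest: they show a call and the
output it should produce, so the same text serves as both documentation
and a test that can be rerun whenever the code changes.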

The home for this project is here:


This is not a sprint.  It is a marathon, and this time we are going to
finish.  We hope you will join us!

--jh-- and Stefan and Perry and Pauli and Emmanuelle...and you!
Joe Harrington
Stefan van der Walt
Perry Greenfield
Pauli Virtanen
Emmanuelle Guillart
...and you!
