[SciPy-user] Docstring standards for NumPy and SciPy

Robert Kern robert.kern at gmail.com
Wed Jan 10 14:31:29 CST 2007


Travis Oliphant wrote:
> Robert Kern wrote:
> 
>>> """
>>> one-line summary not using variable names or the function name
>>>
>>> A few sentences giving an extended description.
>>>
>>> Inputs:
>>> var1      -- Explanation
>>> variable2 -- Explanation
>>>
>>> Outputs:  named, list, of, outputs
>>> named   -- explanation
>>> list    -- more detail
>>> of      -- blah, blah.
>>> outputs -- even more blah
>>>    
>>>
>> My objection to this kind of list is that it uses up a lot of valuable
>> horizontal space. Additionally, it either entails annoying column alignment *or*
>> it looks really ugly with all of the descriptions unaligned. Look at most of the
>> docstrings in scipy for examples. This is the Enthought standard:
>>
> I'm not sure I understand what is taking up horizontal space (do you 
> mean the required alignment)?
> 
> I do like aligned descriptions.

The "named    -- " part is taking up horizontal space that should be used by the
explanation part. It's a better use of both horizontal and vertical space, IMO,
to put the parameter on its own line and follow it with a description. Longish
parameter names make this kind of formatting prohibitive (and the other way
around!), and I would like to encourage longer, more understandable names.

>> Parameters
>> ----------
>> var1 : type of var1 (loosely)
>>    Description of var1.
>> variable2 : type of variable2 (loosely)
>>    Description of variable2.
>> kwdarg : type of kwdarg, optional
>>    Description of kwdarg.
>>
> This is acceptable in my mind, also. I probably prefer '-' to ':' as a 
> separator, but only because I'm used to it.
> 
>> The (loose) type information is quite handy. Sometimes it is quite difficult to
>> tell what kind of thing the function needs from the usual description,
>> particularly when both scalars and arrays are flying around.
>>  
> Yeah, I could see that.  As you understand but others may not,  we 
> shouldn't get overly specific with the "types" though.  We're supposed 
> to be doing duck typing wherever possible.

Well, there's certainly no pointless type-*checking* going on. It's just that
when I say I want an int, for example, I'm going to be treating what you give me
as an int whatever it happens to be, so it had better quack like an int. I'm just
explicitly stating a precondition that already existed and giving users a much
clearer idea of how they are expected to use the function. It guides the user
toward the sensible, expected thing without preventing them from doing sensible
but unexpected things.
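
To make that concrete, here is a minimal sketch. The function repeat_label and
the class Three are made up for illustration; neither is from numpy or scipy.
The docstring says "int", the body never checks, and anything that behaves like
an int gets through.

```python
def repeat_label(label, count):
    """ Repeat a label a fixed number of times.

    Parameters
    ----------
    label : str
    count : int
        How many copies of the label to join together.
    """
    # No isinstance() check here: count is simply used the way an int would be.
    return ", ".join([label] * count)


class Three:
    """ Hypothetical int-like object: not an int, but it quacks like one. """

    def __index__(self):
        # Python calls __index__ when an object is used where an int is needed.
        return 3


plain = repeat_label("x", 2)          # the sensible, expected use
ducky = repeat_label("x", Three())    # sensible but unexpected, still works
```

Passing something that cannot act as an int (a string, say) fails with a
TypeError at the point of use, which is the precondition the docstring already
spelled out.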

>> The problem with both of these forms is that they are difficult to parse into
>> correct lists. One issue that should be noted is that sometimes the description
>> of a variable is omitted because the description of the function, the variable
>> name, and the type information are more than enough context to tell you
>> everything you need to know about the parameter. The only description you can
>> give is just a repetition of the information that you just gave. Or two
>> variables are substantially similar such that you want to describe them together
>> only once.
>>
>> """ Clip the values of an array.
>>
>> Parameters
>> ----------
>> arr : array
>> lower : number
>> upper : number
>>    The lower and upper bounds to clip the array against.
>> """
>>
>> This is one reason why I'm beginning to like the ReST form of this; it always
>> has a marker to say that this is part of a list.
> 
> I don't see why indentation doesn't give you that information already.
> 
>> """ Clip the values of an array.
>>
>> :Parameters:
>>  - arr : array
>>  - lower : number
>>  - upper : number
>>    The lower and upper bounds to clip the array against.
>> """
>>
>> I don't think that you can write a tool that takes the ambiguous forms to the
>> unambiguous ones.
>>
> I don't know.  If you require indentation for non-list entries, then it 
> seems unambiguous to me.

Hmm. Point. reST didn't handle it when I was playing with it in epydoc (although
it handles the full form, where every list element has a description, just
fine). I overgeneralized.
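
A rough sketch of the indentation rule, to check that it really does
disambiguate. The name parse_parameters is hypothetical and this is not what
epydoc or reST do; it only assumes that an unindented "name : type" line starts
a new entry and that any indented line is description text for the entry just
above it.

```python
import textwrap


def parse_parameters(section):
    """ Parse the body of a Parameters section.

    Parameters
    ----------
    section : str
        The text below the "----------" underline.

    Returns
    -------
    entries : list of (name, type, description) tuples
        The description is an empty string when none was given.
    """
    entries = []
    for line in textwrap.dedent(section).splitlines():
        if not line.strip():
            continue  # skip blank lines
        if line[0].isspace():
            # Indented: description text for the most recent parameter.
            if not entries:
                raise ValueError("description before any parameter: %r" % line)
            name, kind, desc = entries[-1]
            entries[-1] = (name, kind, (desc + " " + line.strip()).strip())
        else:
            # Unindented: a new "name : type" entry with no description yet.
            name, _, kind = line.partition(" : ")
            entries.append((name.strip(), kind.strip(), ""))
    return entries


section = """
arr : array
lower : number
upper : number
   The lower and upper bounds to clip the array against.
"""
parsed = parse_parameters(section)
```

On the clip example this yields three entries, with the description attached
only to upper. Whether a description shared by lower and upper should be copied
to both is a policy question the sketch leaves open.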

>>> Additional Inputs:
>>> kwdarg1 -- A little-used input not always needed.
>>> kwdarg2 -- Some keyword arguments can and should be given in Inputs
>>>            Section.  This is just for "little-used" inputs.
>>>    
>> These should be in the Inputs section, not in a separate section.
>>  
> I don't like this as a requirement because occasionally this clutters 
> the inputs list with a lot of inputs that are rarely used.  The common 
> usage should be listed.  I'd like there to be an option for an 
> Additional Inputs section.
> 
> See some of the functions in scipy.optimize for examples.

I can live with that.

>>> Algorithm:
>>> Notes about the implementation algorithm (if needed).
>>>
>>> This can have multiple paragraphs as can all sections.
>>
>> Meh. This should be in the multi-line description at the top.
>>  
> I don't know.  Not always.  Sometimes sure, but the algorithm can be 
> quite complicated to explain and should therefore be here.   Perhaps we 
> should use Algorithm Notes for this section so that it is clear that 
> simple algorithms should be explained in the multi-line description part.

I'm happy to also allow any arbitrary, unstructured section the author feels is
appropriate. But I feel that the minimal standard that we promote should exclude it.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco

