[SciPy-Dev] stats.distributions.py documentation

Ralf Gommers ralf.gommers@gmail....
Fri Sep 21 14:27:14 CDT 2012


On Sun, Sep 16, 2012 at 11:10 PM, nicky van foreest <vanforeest@gmail.com> wrote:

> Hi,
>
> Below are two proposals to handle the documentation of the scipy
> distributions.
>
> The first is to add a set of examples to each distribution, see the
> list at the end of the mail as an example. However, I actually wonder
> whether it wouldn't be better to put this stuff in the stats tutorial.
> (I recently updated this, but given the list below, it is still not
> complete.) The list below is a bit long... too long perhaps.
>
> I actually get the feeling that, given the enormous diversity of the
> distributions, it may not be possible to automatically generate a set
> of simple examples that work for each and every distribution. Such
> examples then would involve the usage of x.dist.b, and so on, and this
> is not particularly revealing to first (and second) time users.
>

This is exactly what the problem is currently.


> A possible resolution is to include just one or two generic examples
> in the example doc string (e.g., dist.rvs(size = (2,3)) ), and refer
> to the tutorial for the rest. The tutorial then should show extensive
> examples for each method of the norm distribution. I assume that then
> any user of other distributions can figure out how to proceed for
> his/her own distribution.
>

This is a huge amount of work, and the generic example still won't run if
you copy-paste it into a terminal.


>
> The second possibility would be to follow Josef's suggestion:
> --snip snip
> Splitting up the distributions pdf docs in tutorial into separate
> pages for individual distributions, make them nicer with code and
> graphs and link them from the docstring of the distribution.
>

Linking to the tutorial from the docstrings is a good idea, but the
docstrings themselves should be enough to get started.

>
> This would keep the docstring itself from blowing up, but we could get
> the full html reference if we need to.
>
> --snip snip
>
> This idea offers a lot of opportunities. In a previous mail I
> mentioned that I don't quite like that the documentation is spread
> over multiple documents. There are doc strings in distributions.py
> (leading to a bloated file),


It's not that bad imho. The typical docstring looks like:
"""A beta prime continuous random variable.

    %(before_notes)s

    Notes
    -----
    The probability density function for `betaprime` is::

        betaprime.pdf(x, a, b) =
            gamma(a+b) / (gamma(a)*gamma(b)) * x**(a-1) * (1+x)**(-a-b)

    for ``x > 0``, ``a > 0``, ``b > 0``.

    %(example)s
"""

It can't be much shorter than that.
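
For readers unfamiliar with the %(name)s placeholders, here is a minimal
sketch of how such a template can be expanded with plain Python string
formatting. The names `docdict` and `expand_doc` and the placeholder texts
are made up for illustration; the real machinery in distributions.py differs
in detail.

```python
# Hypothetical sketch: expanding %(name)s placeholders in a docstring template.
# `docdict` and `expand_doc` are illustrative names, not the scipy internals.

template = """A beta prime continuous random variable.

%(before_notes)s

Notes
-----
The probability density function for `betaprime` is given here.

%(example)s
"""

# Shared text that every distribution docstring would reuse (made-up content).
docdict = {
    "before_notes": "Methods\n-------\nrvs, pdf, cdf, ...",
    "example": "Examples\n--------\n>>> from scipy.stats import betaprime",
}


def expand_doc(template, docdict):
    """Fill in the shared sections using %-style dictionary formatting."""
    return template % docdict


doc = expand_doc(template, docdict)
print(doc)
```

The per-distribution part stays as short as the template above; only the
shared sections are filled in when the docstring is built.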


> and there is continuous.rst. Part of the
> implementation can be understood from the doc-string, typically, the
> density function, but not the rest;


The pdf and support are given, that's enough to define the distribution. So
that should stay. It doesn't mean we have to copy the whole wikipedia page
for each distribution.
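
To illustrate why the pdf plus the support is sufficient: everything else
(cdf, median, and so on) can in principle be derived from it numerically.
Below is a rough stdlib-only sketch, assuming the standard beta prime density
x**(a-1) * (1+x)**(-a-b) / B(a, b) on x > 0. The function names and the
integration scheme are my own choices for this example, not what scipy does
internally.

```python
import math


def betaprime_pdf(x, a, b):
    """Standard beta prime density on x > 0 (assumed form, see text above)."""
    norm_const = math.gamma(a + b) / (math.gamma(a) * math.gamma(b))
    return norm_const * x ** (a - 1) * (1 + x) ** (-a - b)


def betaprime_cdf(q, a, b, n=20000):
    """Integrate the pdf over (0, q) with the midpoint rule.

    The substitution x = t / (1 - t) maps t in (0, 1) onto x in (0, inf),
    so q = inf corresponds to an upper limit of t = 1.
    """
    t_max = 1.0 if math.isinf(q) else q / (1.0 + q)
    h = t_max / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        x = t / (1.0 - t)
        # dx/dt = 1 / (1 - t)**2
        total += betaprime_pdf(x, a, b) / (1.0 - t) ** 2 * h
    return total


# The density should integrate to 1 over the whole support.
total_mass = betaprime_cdf(math.inf, 2.0, 3.0)
# For a == b the distribution of X and 1/X coincide, so the median is 1.
median_check = betaprime_cdf(1.0, 2.0, 2.0)
print(total_mass, median_check)
```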


> this requires continuous.rst.
> Besides this, in case some specific distribution requires extra
> explanation/examples, this will have to be put in the doc-string, making
> distributions.py longer still. Thus, to take up Josef's suggestion,
> what about a documentation file organised like this:
>

Are you suggesting a reST page here, or a .py file with only docs, and new
magic to make part of the content show up as docstring? The former sounds
better to me.


>
> # some tag to tell that these are the docs for the norm distribution
> # eg.
> # norm_gen
>
> Normal Distribution
> ----------------------------
>
> Notes
> ^^^^^^^
> # should be used by the interpreter
> The probability density function for `norm` is::
>
>        norm.pdf(x) = exp(-x**2/2)/sqrt(2*pi)
>
> Simple Examples
> ^^^^^^^^^^^^^^^^^^^^
> # used by the interpreter
>      >>> norm.rvs( size = (2,3) )
>
> Extensive Examples
> ^^^^^^^^^^^^^^^^^^^^^^^^
> # Not used by the interpreter, but certainly by an HTML viewer,
> # containing graphs, hard/specific examples.
>
> Mathematical Details
> ^^^^^^^^^^^^^^^^^^^^^^
>
> Stuff from continuous.rst
>
> # dist2_gen
> Distribution number 2
> -----------------------------------------
> etc
>
> It shouldn't be too hard to parse such a document, and couple each
> piece of documentation to a distribution in distributions.py (or am I
> mistaken?) as we use the class name as the tag in the documentation
> file. The doc-string for a distribution in distributions.py can then
> be removed.
>
> Nicky
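
A rough sketch of what parsing such a file could look like, to make the
proposal concrete. It assumes the layout described above: a "# <class>_gen"
tag line, a title underlined with "-", and sections underlined with "^".
The names `parse_docs`, `build_docstring` and `DOCSTRING_SECTIONS` are
hypothetical, and nothing like this exists in scipy as far as this thread
shows.

```python
import re

# Example input in the proposed layout (content shortened for the sketch).
DOC = """\
# norm_gen
Normal Distribution
-------------------

Notes
^^^^^
The probability density function for `norm` is::

    norm.pdf(x) = exp(-x**2/2)/sqrt(2*pi)

Simple Examples
^^^^^^^^^^^^^^^
    >>> norm.rvs(size=(2, 3))

Extensive Examples
^^^^^^^^^^^^^^^^^^
Graphs and harder examples, for the HTML docs only.
"""

TAG = re.compile(r"^#\s*(\w+_gen)\s*$")
# Assumption: only these sections end up in the interactive docstring.
DOCSTRING_SECTIONS = ("Notes", "Simple Examples")


def is_underline(line, char):
    """True if the line consists only of the given underline character."""
    stripped = line.strip()
    return bool(stripped) and set(stripped) == {char}


def parse_docs(text):
    """Return {class_name: {section_title: body}} for a tagged doc file."""
    docs = {}
    sections = None  # sections of the distribution being read
    current = None   # title of the section being read
    lines = text.splitlines()
    for i, line in enumerate(lines):
        nxt = lines[i + 1] if i + 1 < len(lines) else ""
        tag = TAG.match(line)
        if tag:
            sections = docs.setdefault(tag.group(1), {})
            current = None
        elif sections is None or line.startswith("#"):
            continue  # text before the first tag, or a comment line
        elif is_underline(line, "-") or is_underline(line, "^"):
            continue  # underline of a title or section heading
        elif line.strip() and is_underline(nxt, "^"):
            current = line.strip()
            sections[current] = []
        elif line.strip() and is_underline(nxt, "-"):
            current = None  # distribution title, not part of any section
        elif current is not None:
            sections[current].append(line)
    return {
        name: {title: "\n".join(body).strip("\n") for title, body in secs.items()}
        for name, secs in docs.items()
    }


def build_docstring(sections):
    """Join only the sections meant for the interactive docstring."""
    parts = []
    for title in DOCSTRING_SECTIONS:
        if title in sections:
            parts.append(f"{title}\n{'-' * len(title)}\n{sections[title]}")
    return "\n\n".join(parts)


parsed = parse_docs(DOC)
print(build_docstring(parsed["norm_gen"]))
```

The class name doubles as the lookup key, so attaching the result to a
distribution would amount to a dictionary lookup by class name.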
>
> Example for the examples section of the docstring of norm.
>

This example is good. Perhaps the frozen distribution needs a few words of
explanation. I suggest doing a few more of these for common distributions,
and link to the norm() docstring from less common distributions. Other than
that, I wouldn't change anything about the docstrings. Built docs could be
reworked more thoroughly.
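
As a possible starting point for those few words on freezing, something
along these lines might do. This is only a sketch; the variable names are
arbitrary and the wording would need to fit the docstring style.

```python
from scipy.stats import norm

# Unfrozen: loc and scale are passed on every call.
p_direct = norm.cdf(0.0, loc=1.0, scale=2.0)

# Frozen: calling norm(...) fixes loc and scale once and returns an object
# whose methods no longer take those parameters.
rv = norm(loc=1.0, scale=2.0)
p_frozen = rv.cdf(0.0)

print(p_direct, p_frozen)      # both calls evaluate the same cdf
print(rv.mean(), rv.std())     # moments of the frozen distribution
samples = rv.rvs(size=(2, 3))  # random deviates, array of shape (2, 3)
print(samples.shape)
```

The frozen form is convenient when the same parameters are reused across
several method calls.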

Ralf



>
>     Notes
>     -----
>     The probability density function for `norm` is::
>
>         norm.pdf(x) = exp(-x**2/2)/sqrt(2*pi)
>
>     #%(example)s
>
>     Examples
>     --------
>
>     Setting the mean and standard deviation:
>
>         >>> from scipy.stats import norm
>         >>> norm.cdf(0.0)
>         >>> norm.cdf(0., 1) # set mu = loc = 1
>         >>> norm.cdf(0., 1, 2) # mu = loc = 1, scale = sigma = 2
>         >>> norm.cdf(0., loc = 1,  scale = 2) # mu = loc = 1, scale = sigma = 2
>
>     Frozen rvs
>
>         >>> norm(1., 2.).cdf(0)
>         >>> x = norm(scale = 2.)
>         >>> x.cdf(0.0)
>
>     Moments
>
>         >>> norm(loc = 2).stats()
>         >>> norm.mean()
>         >>> norm.moment(2, scale = 3.)
>         >>> x.std()
>         >>> x.var()
>
>     Random number generation
>
>         >>> norm.rvs(3, 1, size = (2,3)) # loc = 3, scale =1, array of shape (2,3)
>         >>> norm.rvs(3, 1, size = [2,3])
>         >>> x.rvs(3)     # array with 3 random deviates
>         >>> x.rvs([3,4]) # array of shape (3,4) with deviates
>
>     Expectations
>
>         >>> norm.expect(lambda x: x, loc = 1) # 1.00000
>         >>> norm.expect(lambda x: x**2, loc = 1., scale = 2.) # second moment
>
>     Support of the distribution
>
>         >>> norm.a # left limit, -np.inf here
>         >>> norm.b # right limit, np.inf here
>
>     Plot of the cdf
>
>         >>> import numpy as np
>         >>> import matplotlib.pyplot as plt
>         >>> x = np.linspace(0, 3)
>         >>> P = norm.cdf(x)
>         >>> plt.plot(x,P)
>         >>> plt.show()