[SciPy-user] information on statistical functions
josef.pktd@gmai...
Wed Dec 17 20:32:47 CST 2008
On Wed, Dec 17, 2008 at 9:03 PM, Robert Kern <robert.kern@gmail.com> wrote:
> On Wed, Dec 17, 2008 at 19:53, <josef.pktd@gmail.com> wrote:
>> On Wed, Dec 17, 2008 at 7:58 PM, Tim Michelsen
>> <timmichelsen@gmx-topmail.de> wrote:
>>> Hello,
>>> I observed that there are 2 standard deviation functions in the
>>> scipy/numpy modules:
>>>
>>> Numpy:
>>> http://docs.scipy.org/doc/numpy/reference/generated/numpy.std.html#numpy.std
>>>
>>> Scipy:
>>> http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.std.html#scipy.stats.std
>>>
>>> What is the difference?
>>> There is no formula included within the docstrings.
>>>
>>> I suppose that np.std() is for the whole population and scipy.std is
>>> designed for a smaller sample in the population.
>>> Is that true?
>>
>> The difference between the population (numpy) and sample (scipy.stats)
>> variance and standard deviation is whether the estimator is
>> biased, i.e. normalized by 1/n, or not, i.e. normalized by 1/(n-1).
>
> It's a shame that the "biased/unbiased" terminology still survives in
> the numpy.std() docstring. It's really quite wrong.
>
I find talking about biased versus unbiased estimators much clearer
than the population/sample distinction. "Degrees of freedom" might
be more descriptive, but its meaning, I guess, relies on knowing
the (asymptotic) distribution of the estimator, which I always forget
and have to look up.
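
To make the 1/n versus 1/(n-1) distinction concrete, here is a minimal
sketch. It assumes a NumPy version in which np.std accepts the ddof
argument, and the array x is made-up illustration data, not anything
from this thread. It only uses np.std and does not call scipy.stats.std.

```python
import numpy as np

# Hypothetical data, chosen so the arithmetic is easy to check by hand.
x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
n = x.size

# Sum of squared deviations from the sample mean.
ss = np.sum((x - x.mean()) ** 2)

std_n = np.sqrt(ss / n)         # 1/n normalization
std_n1 = np.sqrt(ss / (n - 1))  # 1/(n-1) normalization

# ddof ("delta degrees of freedom") makes np.std divide by n - ddof.
assert np.isclose(std_n, np.std(x, ddof=0))
assert np.isclose(std_n1, np.std(x, ddof=1))

print(std_n, std_n1)
```

The 1/(n-1) value is always the larger of the two, and the gap shrinks
as n grows.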
Josef
