[SciPy-User] "small data" statistics
Fri Oct 12 09:21:39 CDT 2012
Thanks for the brief review of the frequentist and Bayesian differences
(I'll try to send a few comments in a future post).
The aim of my previous message was decidedly more pragmatic:
it boiled down to two questions that follow up on Josef's call:
1) In this thread people expressed interest in performing hypothesis tests
on small samples, so does the permutation test address the question posed by
the accompanying motivating example? In my opinion it does not, and I hope I
provided a brief but compelling argument to support this point of view.
2) What are the assumptions under which the permutation test is
valid/acceptable (independently from the accompanying motivating example)?
I have looked around on this topic but have found only generic desiderata for
all resampling approaches, i.e. that the sample should be "representative"
of the underlying distribution - whatever that means in practical terms.
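For concreteness, here is a minimal sketch of the kind of two-sample
permutation test being discussed, written with NumPy. The key assumption it
encodes is exchangeability of the observations under H_0 (group labels can be
shuffled freely); the data values and sample sizes are made up for
illustration only.

```python
import numpy as np

def permutation_test(x, y, n_perm=10000, seed=0):
    """Two-sided permutation test for a difference in means.

    Assumes the observations are exchangeable under H_0, so that
    randomly reassigning group labels generates the null distribution
    of the test statistic.
    """
    rng = np.random.default_rng(seed)
    x, y = np.asarray(x, float), np.asarray(y, float)
    observed = abs(x.mean() - y.mean())
    pooled = np.concatenate([x, y])
    n = len(x)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random relabeling of the pooled sample
        stat = abs(pooled[:n].mean() - pooled[n:].mean())
        if stat >= observed:
            count += 1
    # +1 in numerator and denominator: the observed labeling is itself
    # one valid permutation, and this keeps the p-value strictly positive
    return (count + 1) / (n_perm + 1)

# Made-up small samples, just to exercise the function
x = [2.1, 1.9, 2.4, 2.3]
y = [1.2, 1.4, 1.1, 1.5]
p = permutation_test(x, y)
```

Note that with samples this small the number of distinct relabelings is tiny
(C(8, 4) = 70 here), which bounds how small the p-value can possibly get - one
concrete way the "small data" concern bites.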
What's your take on these two questions?
I guess it would be nice to clarify/discuss the motivating questions and the
assumptions in this thread before planning any coding.
On 10/12/2012 01:12 PM, Sturla Molden wrote:
> The "classical statistics" (sometimes called "frequentist") is very
> different and deals with long-run error rates you would get if the
> experiment and data collection are repeated. In this framework it is
> meaningless to speak about p(H_0|data) or p(H_0 a priori), because H_0
> is not considered a random variable. Probabilities can only be assigned
> to random variables.
> To a Bayesian the data are what you got, and "the universal truth about
> H0" is unknown. Randomness is the uncertainty about this truth.
> Probability is a measure of the precision of our knowledge about H0.
> Taking -log2(p) yields the Shannon information in bits.
> Choosing a side is more a matter of religion than science.
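One small aside on the information-theoretic remark: the Shannon information
(surprisal) of an event with probability p is -log2(p) bits, not p*log2(p);
the latter (negated and summed over outcomes) gives the entropy. A one-liner
to make the units concrete:

```python
import math

def surprisal_bits(p):
    """Self-information of an event with probability p, in bits."""
    return -math.log2(p)

# A fair coin flip carries exactly one bit of information
one_bit = surprisal_bits(0.5)
# A 1-in-4 event carries two bits
two_bits = surprisal_bits(0.25)
```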