[SciPy-User] "small data" statistics

josef.pktd@gmai... josef.pktd@gmai...
Fri Oct 12 13:14:01 CDT 2012

On Fri, Oct 12, 2012 at 10:21 AM, Emanuele Olivetti
<emanuele@relativita.com> wrote:
> Hi Sturla,
> Thanks for the brief review of the frequentist and Bayesian differences
> (I'll try to send a few comments in a future post).
> The aim of my previous message was definitely more pragmatic
> and it boiled down to two questions that stick with Josef's call:

My aim is even more practical:

If everyone else has it, and it's useful, then let's do it in Python.

As for mannwhitneyu, this would mean tables for very small samples,
exact permutation for the next larger sizes, and random permutation
for medium sample sizes.
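
A minimal sketch of the exact and random permutation tiers, using only the
standard library. The function names, the two-sided convention (distance of
U from n*m/2) and the +1 correction in the random version are assumptions
made for illustration; this is not what scipy implements.

```python
import itertools
import random


def u_statistic(x, y):
    """Mann-Whitney U for x: number of pairs with x_i > y_j, ties count 1/2."""
    return sum((xi > yj) + 0.5 * (xi == yj) for xi in x for yj in y)


def exact_pvalue(x, y):
    """Two-sided p-value from enumerating every split of the pooled data."""
    pooled = list(x) + list(y)
    n, m = len(x), len(y)
    center = n * m / 2.0
    observed = abs(u_statistic(x, y) - center)
    count = total = 0
    for idx in itertools.combinations(range(n + m), n):
        chosen = set(idx)
        xs = [pooled[i] for i in idx]
        ys = [pooled[i] for i in range(n + m) if i not in chosen]
        # small tolerance so that equal statistics count as "at least as extreme"
        if abs(u_statistic(xs, ys) - center) >= observed - 1e-12:
            count += 1
        total += 1
    return count / total


def random_pvalue(x, y, n_perm=2000, seed=0):
    """Two-sided p-value from n_perm random relabelings of the pooled data."""
    rng = random.Random(seed)
    pooled = list(x) + list(y)
    n = len(x)
    center = n * len(y) / 2.0
    observed = abs(u_statistic(x, y) - center)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if abs(u_statistic(pooled[:n], pooled[n:]) - center) >= observed - 1e-12:
            count += 1
    # the +1 counts the observed labeling itself, so the p-value is never zero
    return (count + 1) / (n_perm + 1)


print(exact_pvalue([1, 2, 3], [4, 5, 6]))
print(random_pvalue([1, 2, 3], [4, 5, 6]))
```

The exact version enumerates C(n+m, n) splits, which is why it only makes
sense for small samples; the random version trades exactness for a fixed cost.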

(and advertise empirical likelihood in statsmodels)

and for other cases (somewhere in the future) bias correction
and higher order expansions of the distribution of the test
statistics or estimates.


(Limitation: There are too many things for "let's make it available in python".)

> 1) In this thread people expressed interest in making hypothesis testing
> from small samples, so is permutation test addressing the question of
> the accompanying motivating example? In my opinion it is not and I hope I
> provided brief but compelling motivation to support this point of view.

I got two questions "wrong" in the survey, and had to struggle with
several of them (especially because I was implicitly adding "if the
Null is true" to some of the statements).
I find the "at least one wrong answer" graph misleading compared to
the breakdown by question.

Under the assumptions of the tests and the permutation distribution, I think
the permutation tests answer the question whether there are statistically
significant differences (in means, medians, distributions) across samples.
But it's in the classical statistical test tradition.

consistency of test, ...

> 2) What are the assumptions under which the permutation test is
> valid/acceptable (independently from the accompanying motivating example)?
> I have looked around on this topic but I had just found generic desiderata for
> all resampling approaches, i.e. that the sample should be "representative"
> of the underlying distribution - whatever this means in practical terms.

I collected a few papers, but have read them only partially or not at all.


One problem is that all tests rely on assumptions, and with small
samples there is not enough information to test the underlying
assumptions or to switch to something that requires even
weaker assumptions and still has power.

For example, my small Monte Carlo with mannwhitneyu:
The difference between permutation p-values and large-sample normal
distribution p-values is not large. I saw one recommendation that
7 observations for each sample is enough. One reference says the
extreme tail probabilities are inaccurate.
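
To look at the tail claim directly, here is a rough sketch that compares the
exact null distribution of U (no ties) with the normal approximation for
7 observations per sample. The counting recurrence is the standard one for
the Mann-Whitney statistic; the function names and the continuity correction
of 0.5 are my assumptions, not necessarily what mannwhitneyu uses.

```python
from functools import lru_cache
from math import comb, erfc, sqrt


@lru_cache(maxsize=None)
def count_u(u, n, m):
    """Number of orderings of n x's and m y's (no ties) with U == u."""
    if u < 0:
        return 0
    if n == 0 or m == 0:
        return 1 if u == 0 else 0
    # largest pooled value is an x (adds m to U) or a y (adds nothing)
    return count_u(u - m, n - 1, m) + count_u(u, n, m - 1)


def exact_two_sided(u, n, m):
    """Exact two-sided p-value, doubling the smaller tail (capped at 1)."""
    u_low = min(u, n * m - u)
    tail = sum(count_u(k, n, m) for k in range(int(u_low) + 1))
    return min(1.0, 2.0 * tail / comb(n + m, n))


def normal_two_sided(u, n, m):
    """Large-sample normal p-value with a 0.5 continuity correction."""
    mean = n * m / 2.0
    sd = sqrt(n * m * (n + m + 1) / 12.0)
    z = max(abs(u - mean) - 0.5, 0.0) / sd
    return erfc(z / sqrt(2.0))


n = m = 7
for u in (0, 2, 4, 8, 12):
    print(u, round(exact_two_sided(u, n, m), 5), round(normal_two_sided(u, n, m), 5))
```

The same recurrence is one way to generate the small-sample tables mentioned
above without enumerating permutations.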

With only a few observations, the power of the test is very low and
only detects large differences.

If the distributions of the observations are symmetric and the
sample size is the same, then both permutation and normal
pvalues are correctly sized (close to 0.05 under the null) even
if the underlying distributions are different (t(2) versus normal).

If the sample sizes are unequal, then differences in the
distributions cause a bias in the test, under- or over-rejecting.
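
A sketch of the kind of Monte Carlo I mean, not my actual script: it draws
t(2) and standard normal samples (both symmetric around zero) and records how
often the normal-approximation p-value falls below 0.05. The function names,
sample sizes, seed and number of replications are arbitrary choices for
illustration.

```python
import math
import random


def normal_pvalue(x, y):
    """Two-sided Mann-Whitney p-value from the normal approximation."""
    n, m = len(x), len(y)
    u = sum((xi > yj) + 0.5 * (xi == yj) for xi in x for yj in y)
    sd = math.sqrt(n * m * (n + m + 1) / 12.0)
    z = max(abs(u - n * m / 2.0) - 0.5, 0.0) / sd
    return math.erfc(z / math.sqrt(2.0))


def draw_normal(rng):
    return rng.gauss(0.0, 1.0)


def draw_t2(rng):
    # Student t with 2 df is Z / sqrt(chi2_2 / 2), and chi2_2 / 2 ~ Exp(1);
    # the max() only guards against a zero draw
    return rng.gauss(0.0, 1.0) / math.sqrt(max(rng.expovariate(1.0), 1e-300))


def rejection_rate(draw_x, n, draw_y, m, n_rep=2000, alpha=0.05, seed=0):
    """Fraction of simulated data sets in which the test rejects at alpha."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_rep):
        x = [draw_x(rng) for _ in range(n)]
        y = [draw_y(rng) for _ in range(m)]
        if normal_pvalue(x, y) < alpha:
            rejections += 1
    return rejections / n_rep


print("normal(7)  vs normal(7): ", rejection_rate(draw_normal, 7, draw_normal, 7))
print("t2(7)      vs normal(7): ", rejection_rate(draw_t2, 7, draw_normal, 7))
print("t2(5)      vs normal(20):", rejection_rate(draw_t2, 5, draw_normal, 20))
print("t2(20)     vs normal(5): ", rejection_rate(draw_t2, 20, draw_normal, 5))
```

Swapping which sample is the small one is the quick way to see under- versus
over-rejection in the unequal-size case.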

From the references it sounds like the tests are also incorrectly
sized if the distributions are skewed.

The main problem I have in terms of interpretation is that we
are in many cases not really estimating a mean or median
shift, but more likely stochastic dominance.
Under one condition the distribution has "higher" values
than under the other condition, where "higher" could mean a
mean shift or just some higher quantiles (more weight on
larger values).
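
One way to make this concrete: U divided by n*m estimates P(X > Y) (plus half
the probability of a tie), so the statistic can move away from 1/2 without
any shift in the median. A small sketch with made-up data; the function name
and the numbers are purely illustrative.

```python
def prob_superiority(x, y):
    """Estimate P(X > Y) + 0.5 * P(X == Y) as U / (n * m)."""
    n, m = len(x), len(y)
    u = sum((xi > yj) + 0.5 * (xi == yj) for xi in x for yj in y)
    return u / (n * m)


# Hypothetical data: both samples have median 3, but `treated` puts more
# weight on large values
control = [1, 2, 3, 4, 5]
treated = [1, 2, 3, 8, 9]

print(prob_superiority(treated, control))
```

Here the estimate comes out above 0.5 even though the medians are equal,
which is the sense of "higher" I have in mind.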

Thanks for the comments.


> What's your take on these two questions?
> I guess it would be nice to clarify/discuss the motivating questions and the
> assumptions in this thread before planning any coding.
> Best,
> Emanuele
> On 10/12/2012 01:12 PM, Sturla Molden wrote:
>> [...]
>> The "classical statistics" (sometimes called "frequentist") is very
>> different and deals with long-run error rates you would get if the
>> experiment and data collection are repeated. In this framework it is
>> meaningless to speak about p(H_0|data) or p(H_0 a priori), because H_0
>> is not considered a random variable. Probabilities can only be assigned
>> to random variables.
>> [...]
>> To a Bayesian the data are what you got and "the universal truth about
>> H0" is unknown. Randomness is the uncertainty about this truth.
>> Probability is a measurement of the precision or knowledge about H0.
>> Doing the transform p * log2(p) yields the Shannon information in bits.
>> [...]
>> Choosing sides is more a matter of religion than science.
> _______________________________________________
> SciPy-User mailing list
> SciPy-User@scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user
