Chris Rodgers xrodgers@gmail....
Thu Feb 14 18:06:11 CST 2013

```
Hi all,

I use scipy.stats.mannwhitneyu extensively because my data is not at
all normal. I have run into a few "gotchas" with this function and I
wanted to discuss possible workarounds with the list.

1) When this function returns a significant result, it is non-trivial
to determine the direction of the effect! The Mann-Whitney test is NOT
a test on difference of medians or means, so you cannot determine the
direction from these statistics. Wikipedia has a good example of why
it is not a test for a difference of medians:
http://en.wikipedia.org/wiki/Mann%E2%80%93Whitney_U#Illustration_of_object_of_test

I've reprinted it here. The data are the finishing order of hares and
tortoises. Obviously this is contrived but it indicates the problem.
First the setup:
import numpy as np
import scipy.stats

# Finishing order of the race, first place first (H = hare, T = tortoise).
results_l = ('H H H H H H H H H T T T T T T T T T T '
             'H H H H H H H H H H T T T T T T T T T').split()
h = [i for i, r in enumerate(results_l) if r == 'H']
t = [i for i, r in enumerate(results_l) if r == 'T']

And the results:
In [12]: scipy.stats.mannwhitneyu(h, t)
Out[12]: (100.0, 0.0097565768849708391)

In [13]: np.median(h), np.median(t)
Out[13]: (19.0, 18.0)

Hares are significantly faster than tortoises, but we cannot determine
this from the output of mannwhitneyu, and the medians even point the
other way (the median hare finishes after the median tortoise). This
could be fixed either by returning u1 and u2 from the guts of the
function, or by comparing them inside the function and returning the
result. My current workaround is to compare the means, which is
absolutely wrong in theory but usually correct in practice.

2) The documentation states that the sample sizes must be at least 20.
I think this is because the normal approximation for U is not valid
for smaller sample sizes. Is there a table of critical values for U in
scipy.stats that is appropriate for small sample sizes or should the
user implement his or her own?

3) This is picky, but is there a reason that it returns a one-tailed
p-value, while other tests (e.g. ttest_*) default to two-tailed?

Thanks for any thoughts, tips, or corrections and please don't take
these comments as criticisms ... if I didn't enjoy using scipy.stats
so much I wouldn't bother bringing this up!

Chris
```
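
The first suggestion in the message (returning both u1 and u2 so the direction can be read off) can be sketched outside the library. This is a minimal illustration, not scipy's implementation: the helper name `u_statistics` is hypothetical, and it assumes the usual rank-sum definition U_x = R_x - n_x(n_x + 1)/2, with `scipy.stats.rankdata` assigning average ranks to ties.

```python
import numpy as np
from scipy.stats import rankdata


def u_statistics(x, y):
    """Return (u_x, u_y) for two samples (hypothetical helper).

    u_x counts the (x, y) pairs in which the x value is larger and u_y the
    pairs in which the y value is larger; a tied pair adds 0.5 to each.
    """
    x = np.asarray(x)
    y = np.asarray(y)
    n_x, n_y = len(x), len(y)
    # Rank the pooled sample; ties receive the average of their ranks.
    ranks = rankdata(np.concatenate([x, y]))
    rank_sum_x = ranks[:n_x].sum()
    u_x = rank_sum_x - n_x * (n_x + 1) / 2.0
    u_y = n_x * n_y - u_x
    return float(u_x), float(u_y)


# The hare/tortoise finishing order from the message (first place first).
finish = ('H H H H H H H H H T T T T T T T T T T '
          'H H H H H H H H H H T T T T T T T T T').split()
h = [i for i, r in enumerate(finish) if r == 'H']
t = [i for i, r in enumerate(finish) if r == 'T']

u_h, u_t = u_statistics(h, t)
print(u_h, u_t)  # 100.0 261.0
```

Because the values are finishing positions, the larger `u_t` says that the tortoise has the later position in 261 of the 361 hare/tortoise pairs, i.e. hares tend to finish first. The smaller of the two values, 100.0, is the statistic reported in the session above.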