[SciPy-User] stats.ranksums vs. stats.mannwhitneyu
Nils Kölling
nkoelling@gmail....
Wed Oct 10 07:59:30 CDT 2012
Thank you for your reply, Josef! Is there any reason you are
calculating the test manually in your code instead of using
scipy.stats.kruskal?
I have now written my own version that computes permutation-based
p-values using stats.mannwhitneyu, and ran a few trials. Here is what
I get for:
a=8*[0]
b=n*[1]
n = 1 - normal = 0.0133283287808 / permuted = 0.109775608976
n = 2 - normal = 0.00491580235039 / permuted = 0.0232390704372
n = 3 - normal = 0.00244136177941 / permuted = 0.00559977600896
n = 4 - normal = 0.00131365315366 / permuted = 0.00185992560298
n = 5 - normal = 0.000731481991814 / permuted = 0.000719971201152
n = 6 - normal = 0.000414875963454 / permuted = 0.000539978400864
n = 7 - normal = 0.000237996579543 / permuted = 0.00019999200032
n = 8 - normal = 0.000137586057166 / permuted = 0.000159993600256
n = 9 - normal = 7.99851933706e-05 / permuted = 7.9996800128e-05
So if we assume that the permuted p-value is the "true" value, it
seems like one could get away with just using the normal,
non-permutation-based version for n >= 5, since from there on the
permuted value no longer differs much from the normal one. What do
you think?
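
For anyone who wants to reproduce the comparison, here is a minimal
standard-library sketch of it. It is not my actual script: instead of
calling stats.mannwhitneyu it computes U by hand, and the helper names
(u_statistic, normal_pvalue, permutation_pvalue), the seed, and the
permutation counts are made up for illustration. The "normal" value is
assumed to be the two-sided normal approximation with tie and
continuity correction, which appears to match the numbers above. The
permuted value is a Monte Carlo estimate, so it varies slightly from
run to run.

```python
import math
import random


def u_statistic(x, y):
    """Mann-Whitney U of x versus y; a tie between two values counts 1/2."""
    return sum(1.0 if xi > yj else 0.5 if xi == yj else 0.0
               for xi in x for yj in y)


def normal_pvalue(x, y):
    """Two-sided normal-approximation p-value (tie and continuity corrected)."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = list(x) + list(y)
    # Tie term: sum of t**3 - t over each group of t equal values.
    tie_term = sum(t ** 3 - t for t in (pooled.count(v) for v in set(pooled)))
    sd = math.sqrt(n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1.0))))
    if sd == 0.0:
        return 1.0  # all values identical: no evidence of a difference
    # Distance of U from its null mean, shrunk by 0.5 for continuity.
    dist = abs(u_statistic(x, y) - n1 * n2 / 2.0)
    z = max(dist - 0.5, 0.0) / sd
    return math.erfc(z / math.sqrt(2.0))  # equals 2 * (1 - Phi(z))


def permutation_pvalue(x, y, n_perm=25000, seed=0):
    """Monte Carlo permutation p-value based on |U - n1*n2/2|.

    n_perm and seed are illustrative choices, not values from this thread.
    """
    rng = random.Random(seed)
    n1, n2 = len(x), len(y)
    center = n1 * n2 / 2.0
    observed = abs(u_statistic(x, y) - center)
    pooled = list(x) + list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random reassignment of the group labels
        stat = abs(u_statistic(pooled[:n1], pooled[n1:]) - center)
        if stat >= observed:
            hits += 1
    # Add-one estimator so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    a = 8 * [0]
    # Only small n here; larger n gives smaller p-values, which need
    # more permutations to resolve.
    for n in (1, 2, 3):
        b = n * [1]
        print("n = %d - normal = %g / permuted = %g"
              % (n, normal_pvalue(a, b),
                 permutation_pvalue(a, b, n_perm=10000)))
```

Because the pooled sample is the same for every permutation, ranking
the permutations by |U - n1*n2/2| gives the same ordering as ranking
them by the normal-approximation p-value, so either can serve as the
permutation statistic.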
Cheers
Nils