[SciPy-user] help with scipy.stats.mannwhitneyu
Sturla Molden
sturla@molden...
Thu Feb 5 10:56:27 CST 2009
On 2/5/2009 5:39 PM, josef.pktd@gmail.com wrote:
> According to R:
> wilcox.test(x,y)
> Performs one and two sample Wilcoxon tests on vectors of data; the
> latter is also known as 'Mann-Whitney' test.
>
> I tried a normal random variable example (no ties): the test
> statistic returned is exactly the same as the one returned by
> stats.mannwhitneyu(x,y); however, the p-values differ. The p-value in
> stats is half of the one in R (up to 1e-17), as stated in the
> docstring: one-tailed p-value.
I believe there is a bug in SciPy:
def mannwhitneyu(x, y):
    """Calculates a Mann-Whitney U statistic on the provided scores and
    returns the result. Use only when the n in each condition is < 20 and
    you have 2 independent samples of ranks. REMEMBER: Mann-Whitney U is
    significant if the u-obtained is LESS THAN or equal to the critical
    value of U.

    Returns: u-statistic, one-tailed p-value (i.e., p(z(U)))
    """
    x = asarray(x)
    y = asarray(y)
    n1 = len(x)
    n2 = len(y)
    ranked = rankdata(np.concatenate((x,y)))
    rankx = ranked[0:n1]       # get the x-ranks
    #ranky = ranked[n1:]       # the rest are y-ranks
    u1 = n1*n2 + (n1*(n1+1))/2.0 - np.sum(rankx,axis=0)  # calc U for x
    u2 = n1*n2 - u1            # remainder is U for y
    bigu = max(u1,u2)
    smallu = min(u1,u2)
    T = np.sqrt(tiecorrect(ranked))  # correction factor for tied scores
    if T == 0:
        raise ValueError, 'All numbers are identical in amannwhitneyu'
    sd = np.sqrt(T*n1*n2*(n1+n2+1)/12.0)
    z = abs((bigu-n1*n2/2.0) / sd)  # normal approximation for prob calc
    return smallu, 1.0 - zprob(z)
Take a look at the last two lines. Do you see something peculiar?
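
To make the factor of two concrete, here is a minimal sketch of the same normal approximation using only the standard library. It assumes there are no ties (so the tie correction is left out), and the names ranks_no_ties and mwu_normal_approx are hypothetical helpers, not part of SciPy. Because z is wrapped in abs(), the upper-tail probability can never exceed 0.5, and the two-sided value is simply twice that.

```python
import math
import random


def ranks_no_ties(values):
    """Return 1-based ranks; assumes all values are distinct (no ties)."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = float(rank)
    return ranks


def mwu_normal_approx(x, y):
    """Hypothetical helper: smaller U plus one- and two-sided p-values.

    Mirrors the arithmetic of the function quoted above, minus the tie
    correction, and with no continuity correction.
    """
    n1, n2 = len(x), len(y)
    ranked = ranks_no_ties(list(x) + list(y))
    u1 = n1 * n2 + n1 * (n1 + 1) / 2.0 - sum(ranked[:n1])  # U for x
    u2 = n1 * n2 - u1                                      # U for y
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = abs((max(u1, u2) - n1 * n2 / 2.0) / sd)
    # Upper-tail probability of a standard normal at z (always <= 0.5).
    p_one = 0.5 * math.erfc(z / math.sqrt(2.0))
    p_two = min(1.0, 2.0 * p_one)
    return min(u1, u2), p_one, p_two


random.seed(0)
x = [random.gauss(0.0, 1.0) for _ in range(30)]
y = [random.gauss(0.5, 1.0) for _ in range(30)]
u, p_one, p_two = mwu_normal_approx(x, y)
print("U =", u, "one-sided p =", p_one, "two-sided p =", p_two)
```

The two-sided value here is not guaranteed to match R's wilcox.test digit for digit, since R may use an exact test or a continuity correction depending on its arguments and the sample.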
Sturla Molden