[SciPy-user] help with scipy.stats.mannwhitneyu

Sturla Molden sturla@molden...
Thu Feb 5 10:56:27 CST 2009


On 2/5/2009 5:39 PM, josef.pktd@gmail.com wrote:

> According to R:
> wilcox.test(x,y)
> Performs one and two sample Wilcoxon tests on vectors of data; the
> latter is also known as 'Mann-Whitney' test.
> 
> I tried a normal random variable example (no ties): the test
> statistic returned is exactly the same as the one returned by
> stats.mannwhitneyu(x,y), but the p-values differ. The p-value in
> stats is half of the one in R (up to 1e-17), as stated in the
> docstring: one-tailed p-value.


I believe there is a bug in SciPy:


def mannwhitneyu(x, y):
     """Calculates a Mann-Whitney U statistic on the provided scores and
     returns the result.  Use only when the n in each condition is < 20 and
     you have 2 independent samples of ranks.  REMEMBER: Mann-Whitney U is
     significant if the u-obtained is LESS THAN or equal to the critical
     value of U.

     Returns: u-statistic, one-tailed p-value (i.e., p(z(U)))
     """
     x = asarray(x)
     y = asarray(y)
     n1 = len(x)
     n2 = len(y)
     ranked = rankdata(np.concatenate((x,y)))
     rankx = ranked[0:n1]       # get the x-ranks
     #ranky = ranked[n1:]        # the rest are y-ranks
     u1 = n1*n2 + (n1*(n1+1))/2.0 - np.sum(rankx,axis=0)  # calc U for x
     u2 = n1*n2 - u1                            # remainder is U for y
     bigu = max(u1,u2)
     smallu = min(u1,u2)
     T = np.sqrt(tiecorrect(ranked))  # correction factor for tied scores
     if T == 0:
         raise ValueError, 'All numbers are identical in amannwhitneyu'
     sd = np.sqrt(T*n1*n2*(n1+n2+1)/12.0)
     z = abs((bigu-n1*n2/2.0) / sd)  # normal approximation for prob calc
     return smallu, 1.0 - zprob(z)


Take a look at the last two lines. Do you see something peculiar?
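
For comparison, here is a minimal sketch of the same normal approximation
in plain Python. It is not SciPy's implementation: the names average_ranks
and mannwhitney_normal are hypothetical, no continuity correction is
applied, and the tie factor is assumed to enter the variance once (the
textbook form), whereas the quoted code takes a square root of
tiecorrect() and then another one when computing sd. Because z is wrapped
in abs(), the one-tailed value is always the smaller tail, so the sketch
also returns twice that value as a two-sided p-value, which would be
consistent with the factor of two reported against R above.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mannwhitney_normal(x, y):
    """Hypothetical helper: return (smaller U, one-sided p, two-sided p)."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    ranks = average_ranks(list(x) + list(y))
    u1 = n1 * n2 + n1 * (n1 + 1) / 2.0 - sum(ranks[:n1])  # U for x
    u2 = n1 * n2 - u1                                      # U for y

    # Tie factor 1 - sum(t^3 - t) / (n^3 - n), applied once to the variance.
    counts = {}
    for r in ranks:
        counts[r] = counts.get(r, 0) + 1
    tie_factor = 1.0 - sum(t ** 3 - t for t in counts.values()) / float(n ** 3 - n)
    if tie_factor == 0:
        raise ValueError("all values are identical")

    sd = math.sqrt(tie_factor * n1 * n2 * (n + 1) / 12.0)
    z = abs(max(u1, u2) - n1 * n2 / 2.0) / sd

    # Upper-tail probability of a standard normal, via erfc.
    p_one_sided = 0.5 * math.erfc(z / math.sqrt(2.0))
    p_two_sided = min(1.0, 2.0 * p_one_sided)
    return min(u1, u2), p_one_sided, p_two_sided
```

Calling mannwhitney_normal([1, 2, 3], [4, 5, 6]) gives a smaller U of 0,
and swapping the two arguments returns the same triple, since both the
statistic and the tail are chosen symmetrically.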

Sturla Molden











