[SciPy-user] help with scipy.stats.mannwhitneyu

josef.pktd@gmai... josef.pktd@gmai...
Thu Feb 5 18:03:34 CST 2009


On Thu, Feb 5, 2009 at 3:54 PM,  <josef.pktd@gmail.com> wrote:
>>
>> sample size 20, 9 ties
>> this is with R wilcox.exact, ranksums is your ranksum
> ...
>>
>> With this correction, the normal distribution based p-value in
>> ranksums looks exactly the same as stats.mannwhitneyu.
>
> this statement is not correct.
>
> I mixed up my variables and didn't actually have ties, now with ties,
> I still get essentially but not exactly the same results.
>

I think there is a mistake in the tie handling of stats.mannwhitneyu:
in the calculation of the standard error, the sqrt is taken twice.

    T = np.sqrt(tiecorrect(ranked))  # correction factor for tied scores
    if T == 0:
        raise ValueError, 'All numbers are identical in amannwhitneyu'
    sd = np.sqrt(T*n1*n2*(n1+n2+1)/12.0)

I don't have the formulas for the tie correction, but from looking at
the tie correction in Sturla's version of ranksums, it seems that the
first sqrt shouldn't be there.

Can someone with access to the correct references verify this?
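
To make the question concrete, here is a minimal sketch of what the
single-sqrt version would look like. It assumes the usual textbook tie
correction T = 1 - sum(t**3 - t) / (N**3 - N), where t is the size of
each group of tied values and N = n1 + n2; that formula is an
assumption in this sketch and has not been checked against a
reference in this thread. The names tie_correction and mwu_sd, and
the double_sqrt flag, are made up for illustration and are not part
of scipy.

```python
import math
from collections import Counter


def tie_correction(values):
    """Assumed textbook factor: T = 1 - sum(t**3 - t) / (N**3 - N)."""
    n = len(values)
    if n < 2:
        return 1.0
    # t runs over the sizes of the groups of identical values
    tie_sum = sum(t**3 - t for t in Counter(values).values())
    return 1.0 - tie_sum / float(n**3 - n)


def mwu_sd(x, y, double_sqrt=False):
    """Standard deviation of U under the normal approximation.

    double_sqrt=True mimics the excerpt above, where sqrt is applied
    to T before T enters the variance (hypothetical flag, only for
    comparison).
    """
    n1, n2 = len(x), len(y)
    T = tie_correction(list(x) + list(y))
    if T == 0:
        raise ValueError("All numbers are identical")
    if double_sqrt:
        T = math.sqrt(T)
    return math.sqrt(T * n1 * n2 * (n1 + n2 + 1) / 12.0)


# Small made-up samples with ties (2 and 3 each occur three times).
x = [1, 2, 2, 3, 4]
y = [2, 3, 3, 5, 6]
sd_single = mwu_sd(x, y)
sd_double = mwu_sd(x, y, double_sqrt=True)
print(sd_single, sd_double)
```

Since 0 < T <= 1, taking sqrt(T) first moves the factor towards 1, so
the double-sqrt version gives a slightly larger sd whenever there are
ties and the same sd when there are none. That would fit with results
that are close to, but not exactly, the same.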

Josef

