[SciPy-Dev] anyone want to fix Mann-Whitney test?

Ralf Gommers ralf.gommers@googlemail....
Sun Feb 5 09:39:29 CST 2012


On Sun, Feb 5, 2012 at 3:49 PM, <josef.pktd@gmail.com> wrote:

> On Sun, Feb 5, 2012 at 9:28 AM, <josef.pktd@gmail.com> wrote:
> >
> >
> >
> > On Sun, Feb 5, 2012 at 8:28 AM, Ralf Gommers <
> ralf.gommers@googlemail.com> wrote:
> >>
> >>
> >>
> >> On Sun, Feb 5, 2012 at 1:19 PM, <josef.pktd@gmail.com> wrote:
> >>>
> >>>
> >>>
> >>> On Sun, Feb 5, 2012 at 5:17 AM, Ralf Gommers <
> ralf.gommers@googlemail.com> wrote:
> >>>>
> >>>> Hi,
> >>>>
> >>>> There's a bug report and a number of new tests for mannwhitneyu at
> http://projects.scipy.org/scipy/ticket/1593. These plus a fix were
> contributed by Sebastian Pölsterl, unfortunately he based his initial fix
> on GPL'ed R code. Therefore I think we can't use that, even after he
> modified it. I looked at the GPL code too; I think we need someone who
> didn't do that to implement a new fix based only on the tests and bug
> report.
> >>>>
> >>>> Any takers?
> >>>
> >>>
> >>> From what I remember my impression is that this is only a "cosmetic"
> change, or better a change in what is returned.
> >>>
> >>> >>> v, pval = stats.mannwhitneyu(x, y)
> >>> >>> len(x)*len(y) - v
> >>> 498.0
> >>
> >>
> >> Ah, okay. I'm not sure if this is a desirable change then. Any idea why
> it was implemented like this?
> >
> >
> > No, I was just fixing bugs. This was one of the early tests I worked on
> when I didn't have stronger opinions what the standard or more informative
> returns are. Since the pvalues are correct, I didn't care too much about
> which test statistic is reported.
> >
> > Looking a bit closer, I'm in favor of the change. Returning the short
> tail instead of the asked for tail in a one-sided test is not really
> "clean", and trying to rewrite this, it's not easy to figure out which is
> which, 210 or 498. I haven't finished yet. I like requests with a full test
> suite.
> >
> > If I remember correctly, then we return almost all the time the
> two-sided test, so adding the option for one-sided test will be backwards
> compatible, but for mannwhitneyu it might not be possible.
>
> rewrite as a standalone function is attached
>

Looks good, thanks. I updated the docstring and put it at
https://github.com/rgommers/scipy/tree/mannwhitneyu-tests.

>
> the last test was missing a self
>
> And I initially had a test failure, because I preferred the keyword
> arguments in reversed sequence and the tests use a keyword arguments
> as positional argument.
>

Users can do that too, so you should never insert new keywords in the
middle.


> Also just tried to match the tests without trying to understand every
> detail again.
>
> I think it would be better if the default is two-sided but this will
> double the reported p-value compared to the current version.
>
> Not worth breaking backwards compatibility for I'd think.

Ralf
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.scipy.org/pipermail/scipy-dev/attachments/20120205/f684d3b6/attachment-0001.html 


More information about the SciPy-Dev mailing list