[SciPy-Dev] anyone want to fix Mann-Whitney test?

josef.pktd@gmai... josef.pktd@gmai...
Sun Feb 5 10:14:09 CST 2012


On Sun, Feb 5, 2012 at 10:39 AM, Ralf Gommers
<ralf.gommers@googlemail.com> wrote:
>
>
> On Sun, Feb 5, 2012 at 3:49 PM, <josef.pktd@gmail.com> wrote:
>>
>> On Sun, Feb 5, 2012 at 9:28 AM, <josef.pktd@gmail.com> wrote:
>> >
>> >
>> >
>> > On Sun, Feb 5, 2012 at 8:28 AM, Ralf Gommers
>> > <ralf.gommers@googlemail.com> wrote:
>> >>
>> >>
>> >>
>> >> On Sun, Feb 5, 2012 at 1:19 PM, <josef.pktd@gmail.com> wrote:
>> >>>
>> >>>
>> >>>
>> >>> On Sun, Feb 5, 2012 at 5:17 AM, Ralf Gommers
>> >>> <ralf.gommers@googlemail.com> wrote:
>> >>>>
>> >>>> Hi,
>> >>>>
>> >>>> There's a bug report and a number of new tests for mannwhitneyu at
>> >>>> http://projects.scipy.org/scipy/ticket/1593. These plus a fix were
>> >>>> contributed by Sebastian Pölsterl, unfortunately he based his initial fix on
>> >>>> GPL'ed R code. Therefore I think we can't use that, even after he modified
>> >>>> it. I looked at the GPL code too; I think we need someone who didn't do that
>> >>>> to implement a new fix based only on the tests and bug report.
>> >>>>
>> >>>> Any takers?
>> >>>
>> >>>
>> >>> From what I remember my impression is that this is only a "cosmetic"
>> >>> change, or better a change in what is returned.
>> >>>
>> >>> >>> v, pval = stats.mannwhitneyu(x, y)
>> >>> >>> len(x)*len(y) - v
>> >>> 498.0
>> >>
>> >>
>> >> Ah, okay. I'm not sure if this is a desirable change then. Any idea why
>> >> it was implemented like this?
>> >
>> >
>> > No, I was just fixing bugs. This was one of the early tests I worked on
>> > when I didn't have stronger opinions what the standard or more informative
>> > returns are. Since the pvalues are correct, I didn't care too much about
>> > which test statistic is reported.
>> >
>> > Looking a bit closer, I'm in favor of the change. Returning the short
>> > tail instead of the asked for tail in a one-sided test is not really
>> > "clean", and trying to rewrite this, it's not easy to figure out which is
>> > which, 210 or 498. I haven't finished yet. I like requests with a full test
>> > suite.
>> >
>> > If I remember correctly, then we return almost all the time the
>> > two-sided test, so adding the option for one-sided test will be backwards
>> > compatible, but for mannwhitneyu it might not be possible.
>>
>> rewrite as a standalone function is attached
>
>
> Looks good, thanks. I updated the docstring and put it at
> https://github.com/rgommers/scipy/tree/mannwhitneyu-tests.

I had managed to work with git, it just takes some time (and I cannot
test with current scipy)
https://github.com/josef-pkt/scipy/commit/30aa361fc76dea7f7fd76c3f4f7babcd288f7c01

a bit more streamlining and I increased significant to 14

>>
>>
>> the last test was missing a self
>>
>> And I initially had a test failure, because I preferred the keyword
>> arguments in reversed sequence and the tests use a keyword arguments
>> as positional argument.
>
>
> Users can do that too, so you should never insert new keywords in the
> middle.
>
>>
>> Also just tried to match the tests without trying to understand every
>> detail again.
>>
>> I think it would be better if the default is two-sided but this will
>> double the reported p-value compared to the current version.
>>
> Not worth breaking backwards compatibility for I'd think.

I think, there is no option in this version that would be backwards
compatible, since the old version calculated the two-sided test
statistic but reported only pvalue/2.
The old version was "correct" only with the right interpretation or usage.

Since the results will be difficult, I would have really broken with
the old version and used the nicer ordering of keywords.

If we need to deprecate and break backwards compatibility, then it
might be worth to review the entire group of rank tests again. There
is at least one ticket on this, also with the comparison of various
options and results with matlab and R.

Josef

>
> Ralf
>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev@scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev
>


More information about the SciPy-Dev mailing list