[Scipy-tickets] [SciPy] #1489: fisher_exact throws ValueErrors when row or col is 0, 0
SciPy Trac
scipy-tickets@scipy....
Sun Aug 7 16:37:44 CDT 2011
#1489: fisher_exact throws ValueErrors when row or col is 0,0
--------------------------+-------------------------------------------------
Reporter: aihardin | Owner: somebody
Type: defect | Status: new
Priority: normal | Milestone: Unscheduled
Component: scipy.stats | Version: 0.9.0
Keywords: fisher_exact |
--------------------------+-------------------------------------------------
Comment(by josefpktd):
I don't think changing the numbers to make them non-zero is a good idea; it
would return an answer for a different table.
R reports p-value = 1 and odds ratio = 0 when a 2x2 table has a row or
column of zeros:
{{{
> fisher.test(x, alternative = "tw")
Fisher's Exact Test for Count Data
data: x
p-value = 1
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
0 Inf
sample estimates:
odds ratio
0
}}}
This doesn't make much sense to me. There seems to be a conflict: the
p-value is 1, but the confidence interval says the odds ratio could be
anything.
I still don't have enough intuition about the Fisher exact test for this.
I would raise a ValueError.
However, this is somewhat similar to the discussion on what to return in
the ttest when the variances are zero. There I decided to return a
partially arbitrary value to avoid nans (0/0 = ?). But here I don't know
what the p-value should be.
The problem I have with this is that Fisher's exact test is conditional on
the marginals, which in this case means conditioning on an empty set. In my
interpretation, this means that we don't have any information in our
sample, and we should raise a ValueError.
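To make the conditioning concrete: with the margins fixed, the top-left
cell of a 2x2 table follows a hypergeometric distribution. The sketch below
is plain Python, and conditional_support and conditional_pmf are names made
up for illustration (they are not scipy functions). It lists the values the
top-left cell can take given the margins. With a zero row or column, the
range collapses to the single value 0, so the observed table is the only
table possible.

```python
from math import comb


def conditional_support(table):
    """Range (lo, hi) of the top-left cell given the table's margins."""
    (a, b), (c, d) = table
    row1, col1, n = a + b, a + c, a + b + c + d
    lo = max(0, row1 + col1 - n)
    hi = min(row1, col1)
    return lo, hi


def conditional_pmf(table):
    """Hypergeometric probabilities of the top-left cell given the margins."""
    (a, b), (c, d) = table
    row1, col1, n = a + b, a + c, a + b + c + d
    lo, hi = conditional_support(table)
    denom = comb(n, col1)
    return {
        x: comb(row1, x) * comb(n - row1, col1 - x) / denom
        for x in range(lo, hi + 1)
    }


# Zero first row: only one table is compatible with the margins.
print(conditional_support([[0, 0], [5, 3]]))  # (0, 0)
print(conditional_pmf([[0, 0], [5, 3]]))      # {0: 1.0}

# A table without zero margins has a non-trivial support.
print(conditional_support([[3, 1], [1, 3]]))  # (0, 4)
```

Since the conditional distribution puts all its mass on the observed table,
any p-value computed from it is trivially 1, which matches what R prints.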
If this were an unconditional test, then zero rows or columns would be a
strong indication of independence (we always get the same value,
regardless of what the other variable is) and the p-value should be large
(or 1).
I vote for ValueError, but if users of fisher_exact have a strong opinion
about a default p-value for the zero row or column case, that would also be
fine with me.
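As a rough sketch of the ValueError option, the check could look like the
following. The function name check_2x2_table and the error messages are
hypothetical; this is not what fisher_exact currently does, only an
illustration of the proposed behavior.

```python
def check_2x2_table(table):
    """Raise ValueError if a 2x2 table has an all-zero row or column.

    Hypothetical helper for illustration; returns the table unchanged
    when both margins are non-zero.
    """
    (a, b), (c, d) = table
    if a + b == 0 or c + d == 0:
        raise ValueError("table has a row of zeros")
    if a + c == 0 or b + d == 0:
        raise ValueError("table has a column of zeros")
    return table


check_2x2_table([[1, 2], [3, 4]])      # passes silently
try:
    check_2x2_table([[0, 0], [5, 3]])  # zero first row
except ValueError as exc:
    print("rejected:", exc)
```

A caller that prefers a default p-value instead could catch the ValueError
and substitute whatever value it considers appropriate.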
--
Ticket URL: <http://projects.scipy.org/scipy/ticket/1489#comment:4>
SciPy <http://www.scipy.org>
SciPy is open-source software for mathematics, science, and engineering.