[SciPy-User] ttest returning NaN for 0/0 zero variance?
josef.pktd@gmai...
Sun Jun 3 06:25:01 CDT 2012
Should we stop guessing what the t-test should return with zero variance, and
switch to returning NaN in these cases?
http://article.gmane.org/gmane.comp.python.scientific.devel/9622
Initially, IIRC, I tried to avoid the NaN because there was a
discussion somewhere, possibly on the pymvpa mailing list or in its code,
where a NaN in the results was not wanted.
The question of whether 0/0=0 or 0/0=1 in the t-tests comes up about once a year.
Since I finally have R open again (reduced output):
almost the same
> t.test(c(0,0,1e-15), c(0,0,0),var.equal=TRUE)
Two Sample t-test
data: c(0, 0, 1e-15) and c(0, 0, 0)
t = 1, df = 4, p-value = 0.3739
> t.test(c(0,0,1e-100), c(0,0,0),var.equal=TRUE)
Two Sample t-test
data: c(0, 0, 1e-100) and c(0, 0, 0)
t = 1, df = 4, p-value = 0.3739
> t.test(c(0,0,1e-100), c(0,0,1e-50),var.equal=TRUE)
Two Sample t-test
data: c(0, 0, 1e-100) and c(0, 0, 1e-50)
t = -1, df = 4, p-value = 0.3739
> t.test(c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,1e-100), c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,1e-100),var.equal=TRUE)
Two Sample t-test
data: c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1e-100) and c(0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1e-100)
t = 0, df = 28, p-value = 1
> t.test(c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,1e-50), c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,1e-100),var.equal=TRUE)
Two Sample t-test
data: c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1e-50) and c(0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1e-100)
t = 1, df = 28, p-value = 0.3259
> t.test(c(0,0,1e-100), c(0,0,1e-100),var.equal=TRUE)
Two Sample t-test
data: c(0, 0, 1e-100) and c(0, 0, 1e-100)
t = 0, df = 4, p-value = 1
exactly the same
> t.test(c(0,0,0), c(0,0,0),var.equal=TRUE)
Two Sample t-test
data: c(0, 0, 0) and c(0, 0, 0)
t = NaN, df = 4, p-value = NA
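For comparison, here is a minimal Python sketch of the pooled-variance statistic that corresponds to var.equal=TRUE in the R calls above. The function name pooled_t is made up for illustration and is not a scipy function. Returning NaN for 0/0 follows the last R run; returning a signed infinity when only the denominator is zero is my own assumption and is not covered by the R output above.

```python
import math


def pooled_t(x, y):
    """Two-sample t statistic with pooled variance (hypothetical helper)."""
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    # Sums of squared deviations from each sample mean.
    ssx = sum((v - mx) ** 2 for v in x)
    ssy = sum((v - my) ** 2 for v in y)
    df = nx + ny - 2
    # Standard error of the difference in means under equal variances.
    se = math.sqrt((ssx + ssy) / df * (1.0 / nx + 1.0 / ny))
    diff = mx - my
    if se == 0.0:
        if diff == 0.0:
            # 0/0: undefined, so report NaN instead of guessing 0 or 1.
            return math.nan
        # Assumption: nonzero difference with zero variance gives +/- inf.
        return math.copysign(math.inf, diff)
    return diff / se


# "almost the same" samples: the statistic should come out close to 1
print(pooled_t([0, 0, 1e-100], [0, 0, 0]))
# "exactly the same" samples: 0/0, reported as nan
print(pooled_t([0, 0, 0], [0, 0, 0]))
```

The explicit branch on se == 0.0 is where a 0/0=0 or 0/0=1 convention would go if NaN is not wanted.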
If we don't return NaN, then I'm still in favor of the 0/0=1 solution.
Josef