[SciPy-Dev] Resolving PR 235: t-statistic = 0/0 case

Angel Yanguas-Gil angel.yanguas@gmail....
Wed Jun 6 16:37:36 CDT 2012


Regarding the 0/0 case: if your variance is nominally zero (and you
are using your sample to estimate the variance), then essentially you
are saying that any infinitesimal deviation from that value is
statistically significant. From a mathematical perspective, a normal
distribution whose variance tends to zero while keeping the same area
tends to a Dirac delta, which is no longer a traditional mathematical
function but a generalized function, or distribution.
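
For reference, the limit can be written out as follows. The symbols
(mu for the mean, sigma for the standard deviation) are introduced
here only for illustration, and the equality holds in the sense of
distributions, not pointwise:

```latex
\lim_{\sigma \to 0^{+}} \frac{1}{\sigma\sqrt{2\pi}}
  \exp\!\left(-\frac{(x-\mu)^{2}}{2\sigma^{2}}\right) = \delta(x-\mu),
\qquad
\int_{-\infty}^{\infty} \frac{1}{\sigma\sqrt{2\pi}}
  \exp\!\left(-\frac{(x-\mu)^{2}}{2\sigma^{2}}\right) dx = 1
  \quad \text{for every } \sigma > 0.
```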

So I guess the question is more: would you like to raise a flag when
you apply your function to two zero variance samples?
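
As a rough sketch of what "raising a flag" could look like, here is a
small pooled two-sample t statistic that warns when both sample
variances are zero. The helper name t_statistic_with_flag is
hypothetical and is not part of scipy.stats, and returning NaN for 0/0
is only one of the options under discussion (the other being t = 0):

```python
import math
import warnings

import numpy as np


def t_statistic_with_flag(a, b):
    """Hypothetical helper: pooled two-sample t statistic with a warning
    when the pooled variance is exactly zero."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n1, n2 = a.size, b.size

    # Unbiased sample variances and the pooled variance estimate.
    v1, v2 = a.var(ddof=1), b.var(ddof=1)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * v1 + (n2 - 1) * v2) / df

    diff = a.mean() - b.mean()
    denom = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))

    if denom == 0.0:
        # Assumption for this sketch: flag the case instead of guessing.
        warnings.warn("both samples have zero variance", RuntimeWarning)
        if diff == 0.0:
            return float("nan")            # the 0/0 case
        return math.copysign(math.inf, diff)  # nonzero difference over 0

    return diff / denom
```

With this, t_statistic_with_flag([1, 1, 1], [1, 1, 1]) emits a
RuntimeWarning and returns NaN, while samples with nonzero variance are
handled as usual.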

On Wed, Jun 6, 2012 at 4:18 PM, Junkshops <junkshops@gmail.com> wrote:
> Hi Nathaniel,
> At the outset, I'll just say that if the consensus is that we should
> return NaN, I'll accept that. I'll still try and argue my case though.
>> My R seems to throw an exception whenever the variance is zero
>> (regardless of the mean difference), not return NaN:
> Sorry, yes, that's correct.
>> Like any parametric test, the t-test only makes sense under some kind
>> of (at least approximate) assumptions about the data generating
>> process. When the sample variance is 0, then those assumptions are
>> clearly violated,
> So this seems similar to argument J2, and I still don't understand it.
> Let's say we assume our population data is normally distributed and we
> take three samples from the population and get [1,1,1]. How does that
> prove our assumption is incorrect? It's certainly possible to pull the
> same number three times from a normal distribution.
>> and it doesn't seem appropriate to me to start
>> making up numbers according to some other rule that we hope might give
>> some sort-of appropriate result ("In the face of ambiguity, refuse the
>> temptation to guess."). So I actually like the R/Matlab option of
>> throwing an exception or returning NaN.
> Well, we're not making up numbers here - we absolutely know the means
> are the same. Hence p = 1 and t = 0.
> -g
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev@scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev
