[SciPy-Dev] Resolving PR 235: t-statistic = 0/0 case

Junkshops junkshops@gmail....
Wed Jun 6 19:50:49 CDT 2012


OK, I give! NaN it is.

That being said:

Skipper said:
> This doesn't seem to be of all that much practical importance. In what
> situation do you expect this to really matter?
Eh, you're probably right. I tend to enjoy arguing back and forth (as 
long as it doesn't get heated) and sometimes pick pointless battles. 
Plus sometimes you learn a lot, and I'm not much of a statistician, so 
there are lots of opportunities for that.

That said, if you're pulling data from a discrete distribution, the 0/0 
case could happen (unless I'm mistaken).
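
To make the discrete case concrete, here is a minimal sketch of a 
one-sample t-statistic where every draw comes out the same. The helper 
name one_sample_t is hypothetical (it is not the code in the PR), and it 
uses only NumPy. With a constant sample equal to the hypothesized mean, 
both the numerator and the standard error are exactly zero, so the 
division is 0/0 and NumPy returns NaN.

```python
import numpy as np


def one_sample_t(sample, popmean):
    """Hypothetical helper (not the PR's code): one-sample t-statistic."""
    x = np.asarray(sample, dtype=float)
    diff = x.mean() - popmean                # numerator
    se = x.std(ddof=1) / np.sqrt(x.size)     # standard error of the mean
    # Silence the floating-point warnings; 0/0 then simply yields NaN.
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.divide(diff, se)


# Five identical draws, as could happen with a discrete distribution.
constant_sample = [3, 3, 3, 3, 3]
t_zero_over_zero = one_sample_t(constant_sample, 3)

# An ordinary sample for comparison: the statistic is finite.
t_regular = one_sample_t([2, 3, 3, 4, 5], 3)
```
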

Nathan said:
> Well, no, it isn't possible really -- taking n IID samples from a
> normal distribution and getting exactly the same number twice is an
> event that has probability zero.
Would you mind humoring me and explaining why this is true? It seems 
counterintuitive that getting the same sample twice from independent 
random draws is impossible.

OK, so what next? Shall I make the changes and push again? Or should we 
wait a bit and see if anyone else weighs in?

If a push is warranted, the other issue is the style of the 4 t-tests 
(1-sample, paired, 2-sample equal variances, 2-sample unequal variances):

A. 4 separate functions (as in the PR)
B. 1 combined function, select test via keyword arg, keep old function 
stubs for backward compatibility
C. Functions for 1 sample, paired, 2 sample with keyword selection of 
equal vs unequal variances.

I don't have strong feelings either way, but I think C is a little weird 
- should be all or none IMO. We could also go with A for now and change 
to B after the release; I think it's more important that the 
functionality gets in than consolidation of functions.
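
For reference, a rough sketch of what the keyword selection in option C 
might look like for the 2 sample case. The function name two_sample_t 
and the keyword equal_var are assumptions for illustration only, not 
names taken from the PR, and the sketch computes just the t-statistic 
(no p-value). The pooled branch is the usual equal-variance statistic; 
the other branch uses the unequal-variance (Welch) standard error.

```python
import numpy as np


def two_sample_t(a, b, equal_var=True):
    """Hypothetical helper: two-sample t-statistic only, no p-value."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n1, n2 = a.size, b.size
    v1, v2 = a.var(ddof=1), b.var(ddof=1)    # unbiased sample variances
    if equal_var:
        # Pooled variance, assuming both groups share one variance.
        pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
        se = np.sqrt(pooled * (1.0 / n1 + 1.0 / n2))
    else:
        # Unequal variances: each group contributes its own term.
        se = np.sqrt(v1 / n1 + v2 / n2)
    return (a.mean() - b.mean()) / se


a = [5.1, 4.9, 5.6, 5.8, 6.0]
b = [4.2, 4.8, 5.0, 4.4]
t_pooled = two_sample_t(a, b, equal_var=True)
t_welch = two_sample_t(a, b, equal_var=False)
```

When the two groups have the same size, both branches give the same 
statistic, so the keyword only matters for unbalanced samples (and for 
the degrees of freedom, which this sketch leaves out).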

Cheers, Gavin


