[SciPy-Dev] chi-square test for a contingency (R x C) table
Wed Jun 2 13:18:01 CDT 2010
On 2010-06-02 13:10, Bruce Southey wrote:
>>> However, this code is the chi-squared test part as SAS will compute the
>>> actual cell numbers. It is also an extension of scipy.stats.chisquare(),
>>> so we cannot have both functions.
>> Again, I don't understand what you mean that we can't have both
>> functions? I believe (from a statistics teacher's point of view) that
>> the Chi-Squared goodness of fit test (which is stats.chisquare) is a
>> different beast from the Chi-Square test for independence (which is
>> stats.chisquare_contingency). The fact that the distribution of the
>> test statistic is the same should not tempt us to put them into the
>> same function.
> Please read scipy.stats.chisquare() because scipy.stats.chisquare() is
> the 1-d case of yours.
> Quote from the docstring:
> " The chi square test tests the null hypothesis that the categorical data
> has the given frequencies."
> Also go to the web site provided in the docstring.
> By default you get the expected frequencies but you can also put in your
> own using the f_exp variable. You could do the same in your code.
In fact, Warren correctly used stats.chisquare with the expected
frequencies calculated from the null hypothesis and the corrected
degrees of freedom. chisquare_contingency is in some sense a
convenience method for taking care of these pre-calculations before
calling stats.chisquare. Can you explain more clearly to me why we
should not include such a convenience function?
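
To make the "convenience" point concrete, here is a minimal sketch of the pre-calculations described above: expected frequencies under the null hypothesis of independence, the corrected degrees of freedom, and then a plain call to stats.chisquare. The function name contingency_sketch and its return layout are hypothetical, chosen only for illustration; this is not the code attached to the ticket.

```python
import numpy as np
from scipy import stats


def contingency_sketch(observed):
    """Chi-square test of independence for an R x C table (illustrative only)."""
    observed = np.asarray(observed, dtype=float)

    # Expected counts under independence: row total * column total / grand total.
    row_totals = observed.sum(axis=1, keepdims=True)
    col_totals = observed.sum(axis=0, keepdims=True)
    expected = row_totals * col_totals / observed.sum()

    # Degrees of freedom for an R x C table.
    n_rows, n_cols = observed.shape
    dof = (n_rows - 1) * (n_cols - 1)

    # stats.chisquare uses k - 1 - ddof degrees of freedom for k cells,
    # so ddof must remove the difference between k - 1 and (R-1)(C-1).
    ddof = observed.size - 1 - dof

    statistic, p_value = stats.chisquare(
        observed.ravel(), f_exp=expected.ravel(), ddof=ddof
    )
    return statistic, p_value, dof, expected
```

Calling contingency_sketch([[10, 10, 20], [20, 20, 20]]) returns the test statistic, the p-value, the degrees of freedom, and the table of expected counts. Everything that differs from the goodness-of-fit case sits in the lines before the stats.chisquare call, which is the sense in which the wrapper is a convenience.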
>>> Really this should be combined with fisher.py in ticket 956:
>> Wow, apparently I have lots of disagreements today, but I don't think
>> that this should be combined with Fisher's Exact test. (I would like
>> to see that ticket mature to the point where it can be added to
>> scipy.stats.) I like the functions in scipy.stats to correspond in a
>> one-to-one manner with the statistical tests. I think that the docs
>> should "See Also" the appropriate exact (and non-parametric) tests,
>> but I think that one function/one test is a good rule. This is
>> particularly true for people (like me) who would like to someday be
>> able to use scipy.stats in a pedagogical context.
> I don't see any 'disagreements', just different ways to do things
> and areas that need to be addressed for more general use.