[SciPy-Dev] chi-square test for a contingency (R x C) table

Bruce Southey bsouthey@gmail....
Fri Jun 4 12:08:12 CDT 2010


On 06/03/2010 08:27 AM, Warren Weckesser wrote:
> Just letting you know that I'm not ignoring all the great comments from
> josef, Neil and Bruce about my suggestion for chisquare_contingency.
> Unfortunately, I won't have time to think about all the deeper
> suggestions for another week or so.   For now, I'll just say that I
> agree with josef's and Neil's suggestions for the docstring, and that
> Neil's summary of the function as simply a convenience function that
> calls stats.chisquare with appropriate arguments to perform a test of
> independence on a contingency table is exactly what I had in mind.
>
> Warren
>
>
>    
Hi,
I looked at how SAS handles n-way tables. It appears to break 
the original table down into a set of 2-way tables and analyze 
each of these. So a 3 by 4 by 5 table is processed as three 2-way 
tables, with the results for each 4 by 5 table presented. I do not know 
how Stata and R analyze n-way tables.
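
A minimal sketch of that layer-by-layer approach, as I understand the SAS behaviour described above. The helper name chisquare_2way and the counts are made up for illustration (this is not the attached cont_table.py, and not the SAS example data). It calls stats.chisquare with the expected frequencies under independence and adjusts ddof so the degrees of freedom come out as (r - 1) * (c - 1).

```python
import numpy as np
from scipy import stats


def chisquare_2way(table):
    """Pearson chi-square test of independence for a 2-D table of counts.

    Hypothetical helper for illustration only.
    """
    obs = np.asarray(table, dtype=float)
    row = obs.sum(axis=1, keepdims=True)   # shape (r, 1)
    col = obs.sum(axis=0, keepdims=True)   # shape (1, c)
    expected = row * col / obs.sum()       # expected counts under independence
    r, c = obs.shape
    # stats.chisquare uses k - 1 - ddof degrees of freedom, with k = r * c
    # cells, so ddof = r + c - 2 yields the usual (r - 1) * (c - 1).
    chi2, p = stats.chisquare(obs.ravel(), f_exp=expected.ravel(),
                              ddof=r + c - 2)
    return chi2, p, (r - 1) * (c - 1)


# Made-up counts, only to show the shapes involved.
rng = np.random.default_rng(0)
table3 = rng.integers(5, 50, size=(3, 4, 5))

# One test per level of the first dimension: three 4 by 5 tables.
results = [chisquare_2way(layer) for layer in table3]
for level, (chi2, p, dof) in enumerate(results):
    print(f"level {level}: chi2={chi2:.4f}, dof={dof}, p={p:.4f}")
```

Iterating over a NumPy array walks its first axis, which is why the loop sees three 4 by 5 layers.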

Consequently, I rewrote my suggested code (attached) to handle 3- and 
4-way tables by using recursion. There should be some Python way to do 
that recursion for any number of dimensions. I also added the 1-way 
table (which tests a different hypothesis than the 2-way table) so 
users can pass a 1-d table.
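
One possible way to do the recursion for any number of dimensions is sketched below. This is only an illustration under my own assumptions, not the attached code: the names pearson_2way and chisquare_nway are hypothetical, the leading dimensions are treated as strata, and a 1-d table is tested against equal expected frequencies (the default of stats.chisquare), which is a goodness-of-fit hypothesis rather than independence.

```python
import numpy as np
from scipy import stats


def pearson_2way(obs):
    """Chi-square test of independence for one 2-D table (hypothetical helper)."""
    obs = np.asarray(obs, dtype=float)
    expected = np.outer(obs.sum(axis=1), obs.sum(axis=0)) / obs.sum()
    chi2 = ((obs - expected) ** 2 / expected).sum()
    dof = (obs.shape[0] - 1) * (obs.shape[1] - 1)
    return chi2, stats.chi2.sf(chi2, dof), dof


def chisquare_nway(table, index=()):
    """Return a list of (index, chi2, p, dof), one entry per 2-way table.

    `index` records which levels of the leading dimensions were fixed.
    """
    table = np.asarray(table)
    if table.ndim == 0:
        raise ValueError("need at least a 1-d table")
    if table.ndim == 1:
        # 1-way table: goodness of fit to equal frequencies,
        # a different hypothesis than the 2-way test of independence.
        chi2, p = stats.chisquare(table)
        return [(index, chi2, p, table.size - 1)]
    if table.ndim == 2:
        return [(index,) + pearson_2way(table)]
    results = []
    for i, sub in enumerate(table):   # peel off the first dimension
        results.extend(chisquare_nway(sub, index + (i,)))
    return results


# Made-up 4-way table: 2 * 3 = 6 trailing tables, each 4 by 5.
table4 = np.arange(1, 2 * 3 * 4 * 5 + 1).reshape(2, 3, 4, 5)
for idx, chi2, p, dof in chisquare_nway(table4):
    print(idx, round(chi2, 4), dof, round(p, 4))
```

The recursion could probably be replaced by a flat loop over np.ndindex(*table.shape[:-2]), which visits the same leading indices without nested calls.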

The data come from two SAS examples; I added a dimension to get a 
4-way table. I included the SAS values for reference, but these are 
given only to 4 decimal places.

http://support.sas.com/documentation/cdl/en/procstat/63104/HTML/default/viewer.htm#/documentation/cdl/en/procstat/63104/HTML/default/procstat_freq_sect029.htm
http://support.sas.com/documentation/cdl/en/procstat/63104/HTML/default/viewer.htm#/documentation/cdl/en/procstat/63104/HTML/default/procstat_freq_sect030.htm 


What is missing:
1) Docstring and tests, but those depend on what is ultimately decided
2) Other test statistics, but the scipy.stats versions are not very 
friendly in that they do not accept a 2-d array
3) A way to do the recursion for any number of dimensions
4) Ability to label the levels etc.
5) Correct handling of input types.

Bruce
-------------- next part --------------
A non-text attachment was scrubbed...
Name: cont_table.py
Type: text/x-python
Size: 4300 bytes
Desc: not available
Url : http://mail.scipy.org/pipermail/scipy-dev/attachments/20100604/25d1e459/attachment.py 