[SciPy-Dev] chi-square test for a contingency (R x C) table
Tue Jun 1 23:28:35 CDT 2010
I've been digging into some basic statistics recently, and developed the
following function for applying the chi-square test to a contingency
table. Does something like this already exist in scipy.stats? If not,
any objections to adding it? (Tests are already written :)

import numpy as np
from scipy.stats import chisquare


# NOTE: the def line is missing from the archived message; the function
# name used here is a placeholder.
def chisquare_contingency(table):
    """Chi-square calculation for a contingency (R x C) table.

    This function computes the chi-square statistic and p-value of the
    data in the table.  The expected frequencies are computed based on
    the relative frequencies in the table.

    Parameters
    ----------
    table : array_like, 2D
        The contingency table, also known as the R x C table.

    Returns
    -------
    chisquare statistic : float
        The chi-square test statistic.
    p : float
        The p-value of the test.
    """
    table = np.asarray(table)
    if table.ndim != 2:
        raise ValueError("table must be a 2D array.")

    # Create the table of expected frequencies.
    total = table.sum()
    row_sum = table.sum(axis=1).reshape(-1, 1)
    col_sum = table.sum(axis=0)
    expected = row_sum * col_sum / float(total)

    # Since we are passing in 1D arrays of length table.size, the default
    # number of degrees of freedom is table.size - 1.
    # For a contingency table, the actual number of degrees of freedom is
    # (nr - 1)*(nc - 1).  We use the ddof argument
    # of the chisquare function to adjust the default.
    nr, nc = table.shape
    dof = (nr - 1) * (nc - 1)
    dof_adjust = (table.size - 1) - dof
    chi2, p = chisquare(np.ravel(table), np.ravel(expected),
                        ddof=dof_adjust)
    return chi2, p
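
As a quick sanity check of the ddof adjustment, here is a minimal sketch that computes the statistic directly from the expected counts and compares the resulting p-value with the one chisquare returns when ddof is set to (table.size - 1) - (nr - 1)*(nc - 1). The 2 x 3 table is made up for illustration, and the variable names are not part of the proposed function.

```python
import numpy as np
from scipy.stats import chi2, chisquare

# Made-up 2 x 3 table of observed counts (illustrative data only).
observed = np.array([[10, 20, 30],
                     [20, 20, 20]])

# Expected counts under independence:
# (row total * column total) / grand total.
row_sum = observed.sum(axis=1).reshape(-1, 1)
col_sum = observed.sum(axis=0)
expected = row_sum * col_sum / float(observed.sum())

# Pearson statistic computed directly from its definition.
stat = ((observed - expected) ** 2 / expected).sum()

# Degrees of freedom for an R x C table, and the matching p-value.
nr, nc = observed.shape
dof = (nr - 1) * (nc - 1)
p_direct = chi2.sf(stat, dof)

# Same computation via chisquare() on the flattened arrays.  Its default
# is (size - 1) degrees of freedom, so ddof removes the difference.
ddof = (observed.size - 1) - dof
stat_cs, p_cs = chisquare(observed.ravel(), expected.ravel(), ddof=ddof)

print(stat, p_direct)
print(stat_cs, p_cs)
```

Both lines should print the same statistic and p-value. Without the ddof argument, chisquare would use 5 degrees of freedom for this table instead of 2 and report a larger p-value.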