[SciPy-user] Some help with chisquare

Bruce Southey bsouthey@gmail....
Mon Oct 27 16:12:21 CDT 2008


Robert Kern wrote:
> On Wed, Oct 22, 2008 at 12:55, Erik Wickstrom <erik@erikwickstrom.com> wrote:
>   
>> Hi,
>>
>> I'm trying to port an application to python, and want to use scipy to handle
>> the statistics.
>>
>> The app takes several tests and uses chi-square to determine which has the
>> highest success rate with a confidence of 95% or better (critical
>> values/degrees of freedom).
>>
>> For example:
>> Test a:
>> Total trials = 100
>> Total successes = 60
>>
>> Test b:
>> Total trials = 105
>> Total successes = 46
>>
>> Test c:
>> Total trials = 98
>> Total successes = 52
>>
>> It then puts the data through some sort of chi-square formula (or so the
>> comments say) and produces a chi-square value that can be compared against
>> the critical values for 95% confidence.
>>
>> Trouble is, I'm not sure which of the many scipy chi-square functions to
>> use, and what data I need to feed into them....
>>     
>
> scipy.stats.chisquare() is probably what you want. Pass it arrays of
> the actual and expected frequencies for each Test. It will return to
> you a Chi^2 value and the associated p-value. If the p-value is <
> 0.05, then the Chi^2 value is greater than the critical value for the
> 95% confidence region.
>
>   

I think there is insufficient information here because the description 
is rather unclear. It sounds like it could be the Cochran-Mantel-Haenszel 
test (http://en.wikipedia.org/wiki/Cochran_test). It would help to see the 
formula used to calculate the chi-square value and degrees of freedom, as 
well as the actual chi-square value and p-value that the application 
returns for the above example.
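
For comparison, here is a minimal sketch of one possible reading of the 
problem. It assumes (and this is only my assumption, not something the 
original application is known to do) that the three tests are compared with 
a chi-square test of homogeneity on a 3x2 table of successes and failures. 
The variable names are made up for illustration. The second call shows how 
scipy.stats.chisquare() can be given the same observed and expected counts, 
with ddof adjusted so that the degrees of freedom match the table.

```python
import numpy as np
from scipy import stats

# Data from the example in the original question.
trials = np.array([100, 105, 98])
successes = np.array([60, 46, 52])

# Assumed layout: one row per test, columns are successes and failures.
observed = np.column_stack([successes, trials - successes])

# Chi-square test of homogeneity; expected counts come from the pooled
# success rate across all three tests.
chi2_stat, p_value, dof, expected = stats.chi2_contingency(observed)

# Critical value for a 5% significance level at these degrees of freedom.
critical = stats.chi2.ppf(0.95, dof)

# The same statistic via chisquare(). By default it uses k - 1 degrees of
# freedom (k = number of cells), so ddof is set to bring that down to dof.
chi2_alt, p_alt = stats.chisquare(
    observed.ravel(), f_exp=expected.ravel(), ddof=observed.size - 1 - dof
)

print("chi-square = %.3f, dof = %d, p = %.4f" % (chi2_stat, dof, p_value))
print("critical value (95%%) = %.3f" % critical)
print("significant at 5%%: %s" % (chi2_stat > critical))
```

If the application's formula gives a different chi-square value or 
degrees of freedom for this example, then it is doing something other 
than this test, which is why the actual formula and output are needed.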

Bruce



