# [Scipy-tickets] [SciPy] #1583: mstats.chisquare and stats.chisquare documentation is out of date

SciPy Trac scipy-tickets@scipy....
Thu Jan 12 10:58:37 CST 2012

```
#1583: mstats.chisquare and stats.chisquare documentation is out of date
-------------------------+--------------------------------------------------
Reporter:  dloewenherz  |       Owner:  somebody
Type:  defect       |      Status:  new
Priority:  normal       |   Milestone:  Unscheduled
Component:  scipy.stats  |     Version:  devel
Keywords:               |
-------------------------+--------------------------------------------------

Comment(by warren.weckesser):

> See
> http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.mstats.chisquare.html
> and
> http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.chisquare.html
>
> It indicates you can specify `ddof`, the degrees of freedom, for the
> p-value of the chi square test. For stats.chisquare, ddof seems to have no
> effect.

Take another look at the docstring.  `ddof` is the *adjustment* to the
default degrees of freedom, which is the number of observations minus 1
(written `k-1` in the docstring, but unfortunately without mentioning what
`k` is).

This is the source code for stats.chisquare:
{{{
    f_obs = asarray(f_obs)
    k = len(f_obs)   # length of the first axis only
    if f_exp is None:
        f_exp = array([np.sum(f_obs,axis=0)/float(k)] * len(f_obs),float)
    f_exp = f_exp.astype(float)
    # (computation of chisq is not shown in this excerpt)
    return chisq, chisqprob(chisq, k-1-ddof)
}}}
This shows that if you pass in a two-dimensional array, each column is
treated as a separate set of observations.  For example:
{{{
In [7]: obs1 = [1,2,3]

In [8]: obs2 = [4,5,4]

In [9]: chisquare(obs1)
Out[9]: (1.0, 0.60653065971263342)

In [10]: chisquare(obs2)
Out[10]: (0.15384615384615388, 0.92596107864231603)

In [11]: m = array([obs1,obs2]).T

In [12]: m
Out[12]:
array([[1, 4],
       [2, 5],
       [3, 4]])

In [13]: chisquare(m)
Out[13]: (array([ 1.        ,  0.15384615]), array([ 0.60653066,  0.92596108]))

}}}
Note that the values in `chisquare(m)` are the same as those of
`chisquare(obs1)` and `chisquare(obs2)`.

When given an n-dimensional array, it treats each one-dimensional slice
along the first dimension as a separate set of observations.  E.g. if you
give an array of shape (3,4,5), you'll get back two arrays of shape (4,5).

This might seem like a feature, but since it is not documented, it could
be considered a bug.

What would be better is to document this feature, and also add an `axis`
keyword to the function, so you can choose the axis along which the
calculation is performed.

--
Ticket URL: <http://projects.scipy.org/scipy/ticket/1583#comment:5>
```
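
The column-wise behaviour described in the comment can be reproduced with a minimal sketch in plain Python. This is not SciPy's implementation: `chisquare_columns` and `chi2_sf_dof2` are hypothetical helper names introduced here for illustration, and the closed-form p-value is only valid for exactly two degrees of freedom, which is the case for the three-row example in the ticket.

```python
import math


def chisquare_columns(rows, ddof=0):
    """Pearson chi-square statistic for each column of a 2-D list of counts.

    Hypothetical helper: the expected frequency defaults to the column mean,
    mirroring the f_exp=None branch quoted in the ticket.
    """
    k = len(rows)        # length of the first axis
    dof = k - 1 - ddof   # ddof is an adjustment to the default k - 1
    stats = []
    for col in zip(*rows):   # iterate over columns
        expected = sum(col) / float(k)
        stats.append(sum((o - expected) ** 2 / expected for o in col))
    return stats, dof


def chi2_sf_dof2(x):
    """Chi-square survival function, valid for 2 degrees of freedom only."""
    return math.exp(-x / 2.0)


m = [[1, 4], [2, 5], [3, 4]]   # columns are obs1 and obs2 from the ticket
stats, dof = chisquare_columns(m)
pvalues = [chi2_sf_dof2(s) for s in stats]
print(stats, dof, pvalues)
```

Each column yields the same statistic and p-value as the corresponding one-dimensional call in the session above, and passing `ddof=1` to the helper lowers the degrees of freedom from 2 to 1.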