[SciPy-dev] stats - kstest

Manuel Metz mmetz at astro.uni-bonn.de
Tue Jul 20 04:32:05 CDT 2004

```Travis Oliphant wrote:
> I reviewed what was done again and now believe we were correct.  The
> distribution that is being used in kstest is the Kolmogorov one-sided
> distribution, KS+. Because this is the distribution used, the test is
> done with a one-sided statistic.
>
> SciPy only has an approximate two-sided statistic which is valid for
> large N.  We do not have it wrapped in a kstest-like command, but the
> distribution is available as kstwobign.
> We could modify kstest or make a new command for the two-sided test.
>

Hm, my first suggestion is to make the notation clearer: many people
know and use "Numerical Recipes" (NR). The notation there is: ksone
= two-sided statistic; kstwo = two-sided statistic for comparing two
data sets. So this may lead to some confusion...

The algorithm of the SciPy distribution 'kstwobign' is the same as the
one given in NR (there 'probks'). They say that the approximation is
good for N>4. Maybe it would be a good idea to implement the two-sided
test under a new name, like 'kstest2side' or 'kstest2s', and for clarity
change the doc-string of kstest to make clear that this is the D+ test.

However, I found two papers that provide more accurate solutions to the
two-sided test (as I understood it):

"Computing the Cumulative Distribution Function of the
Kolmogorov-Smirnov Statistic" by Drew, Glen & Leemis;
http://www.math.wm.edu/~leemis/

and

"Evaluating Kolmogorov's Distribution" by Marsaglia, Tsang & Wang;
http://www.jstatsoft.org/v08/i18/k.ps

In the Introduction of the second paper they say:
"We provide here a relatively small C procedure, K(n,d), that will
provide Pr(D_n<d) with far greater precision than is needed in practice."
This may be a good candidate to be used in SciPy...

Manuel

```
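
To make the distinction discussed above concrete, here is a minimal pure-Python sketch, not SciPy's implementation. It computes the one-sided statistics D+ and D- and the two-sided statistic D for a sample, evaluates the large-N series that the message says 'kstwobign' and NR's 'probks' share, and evaluates Pr(D_n < d) with a matrix method that follows my reading of the Marsaglia, Tsang & Wang procedure. All function names are hypothetical. The exact routine omits the rescaling and the large-argument shortcut of the published C code, so it is only meant for moderate n.

```python
import math


def ks_statistics(sample, cdf):
    """Return (D+, D-, D) for a sample against a model CDF (hypothetical helper)."""
    xs = sorted(sample)
    n = len(xs)
    # D+: largest amount by which the empirical CDF exceeds the model CDF.
    d_plus = max((i + 1) / n - cdf(x) for i, x in enumerate(xs))
    # D-: largest amount by which the model CDF exceeds the empirical CDF.
    d_minus = max(cdf(x) - i / n for i, x in enumerate(xs))
    return d_plus, d_minus, max(d_plus, d_minus)


def kolmogorov_asymptotic_sf(lam, terms=100):
    """Large-N series Q(lam) = 2 * sum_{j>=1} (-1)**(j-1) * exp(-2 j**2 lam**2)."""
    if lam <= 0:
        return 1.0
    total = 0.0
    for j in range(1, terms + 1):
        total += (-1) ** (j - 1) * math.exp(-2.0 * j * j * lam * lam)
    # Clamp, because the truncated series is inaccurate for very small lam.
    return min(1.0, max(0.0, 2.0 * total))


def _mat_mul(a, b):
    """Multiply two square matrices stored as lists of lists."""
    m = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(m)) for j in range(m)]
            for i in range(m)]


def kolmogorov_exact_cdf(n, d):
    """Pr(D_n < d) via the matrix-power method (no rescaling; moderate n only)."""
    if d <= 0:
        return 0.0
    if d >= 1:
        return 1.0
    k = int(n * d) + 1
    m = 2 * k - 1
    h = k - n * d
    # Start from ones on and below the first superdiagonal, zeros above it.
    H = [[1.0 if i - j + 1 >= 0 else 0.0 for j in range(m)] for i in range(m)]
    # Correct the first column and the last row by powers of h.
    for i in range(m):
        H[i][0] -= h ** (i + 1)
        H[m - 1][i] -= h ** (m - i)
    if 2 * h - 1 > 0:
        H[m - 1][0] += (2 * h - 1) ** m
    # Divide each entry by (i - j + 1)! where that factorial index is positive.
    for i in range(m):
        for j in range(m):
            if i - j + 1 > 0:
                H[i][j] /= math.factorial(i - j + 1)
    # Compute H**n by repeated squaring.
    result = [[1.0 if i == j else 0.0 for j in range(m)] for i in range(m)]
    base, e = H, n
    while e:
        if e & 1:
            result = _mat_mul(result, base)
        base = _mat_mul(base, base)
        e >>= 1
    # The probability is n! / n**n times the middle element of H**n.
    s = result[k - 1][k - 1]
    for i in range(1, n + 1):
        s *= i / n
    return s
```

For a sample of size n, a one-sided test would use D+ (or D-), while a two-sided p-value could be estimated either as `kolmogorov_asymptotic_sf(math.sqrt(n) * d)` or as `1 - kolmogorov_exact_cdf(n, d)`. The first is only an approximation for finite n, which is the limitation the message points out.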