[SciPy-dev] stats - kstest

Robert Kern rkern at ucsd.edu
Fri Jul 16 11:21:19 CDT 2004


Manuel Metz wrote:

> Hi,
> hopefully this is the right place to make a suggestion.
> 
> As far as I understand the "kstest" from the book "Numerical Recipes in 
> C++" (Chapt. 14.3, Kolmogorov-Smirnov Test), the kstest algorithm is not 
> correctly implemented in SciPy (or in NR?). I think the error is in the 
> second-to-last line of kstest():
> 
>  >>> D = max(abs(cdfvals - sb.arange(1.0,N+1)/N))
> 
> In comparison from NR:
> 
>  >>> double en = data.size();
>  >>> for( j=0; j<n; j++) {
>  >>>     fn = (j+1)/en;
>  >>>     ff = func( data[j] );
>  >>>     dt = max( fabs(fo-ff), fabs(fn-ff) );
>  >>>     if (dt > d) d=dt;
>  >>>     fo = fn;
>  >>> }
> 
> The main difference is that the NR algorithm calculates "D" as the 
> maximum distance D = max |S_N(x) - P(x)| by checking, at each data 
> point, the distance from P(x) to both the upper AND the lower side of 
> the step in the step function S_N(x), while the SciPy routine checks 
> only the distance to the upper side.
> 
> Am I right that the error is in the SciPy algorithm? If so, could 
> someone correct it in the next release of SciPy?

Yes, I believe you are correct.
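
To make the difference concrete, here is a minimal sketch of the two-sided
statistic in Python. It assumes NumPy and a vectorized CDF callable; the name
ks_statistic is hypothetical and this is not the actual SciPy code or a
proposed patch, only an illustration of checking both sides of each step.

```python
import numpy as np

def ks_statistic(data, cdf):
    """Two-sided KS distance between the empirical CDF of `data` and `cdf`.

    `cdf` is assumed to accept a NumPy array and return an array of the
    same shape (hypothetical interface, chosen for this sketch).
    """
    x = np.sort(np.asarray(data, dtype=float))
    n = len(x)
    cdfvals = cdf(x)
    # Upper side of each step: the empirical CDF just after x[j] is (j+1)/n.
    d_plus = np.max(np.arange(1.0, n + 1) / n - cdfvals)
    # Lower side of each step: the empirical CDF just before x[j] is j/n.
    d_minus = np.max(cdfvals - np.arange(0.0, n) / n)
    return max(d_plus, d_minus)

def uniform_cdf(x):
    # CDF of the uniform distribution on [0, 1], used only for the example.
    return np.clip(x, 0.0, 1.0)

# A single sample at 0.9: the upper-side distance is 0.1, but the
# lower-side distance is 0.9, so the two approaches disagree.
sample = [0.9]
print(ks_statistic(sample, uniform_cdf))
```

With a single sample at 0.9 tested against the uniform distribution, the
upper-side-only formula sees |0.9 - 1| = 0.1, while the lower side of the same
step sits at 0, a distance of 0.9 from the CDF.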

> Manuel

-- 
Robert Kern
rkern at ucsd.edu

"In the fields of hell where the grass grows high
  Are the graves of dreams allowed to die."
   -- Richard Harter



