[SciPy-dev] kstest is reporting wrong p-value ??
Thu Nov 27 11:30:24 CST 2008
I compared with R in more detail:
conclusion for small samples:
* stats.kstest() for fewer than 10 observations is quite wrong
* the calculation of D differs quite a bit from R and Matlab (those two
give the same numbers)
* the exact method in R agrees with stats.ksone.sf(D,n)*2 to
4 decimals. Note the factor of 2.
* the asymptotic distribution in R (not using exact) agrees with
stats.kstwobign.sf(D*sqrt(n)) to more than 7 decimals
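
To make the quantities in the list above concrete, here is a minimal sketch of how D and the two p-value approximations could be computed directly. The function name ks_two_sided is hypothetical, and the formula for D is the standard textbook definition (the larger of the two one-sided distances); I am not claiming this is what R or Matlab do internally. Doubling the one-sided ksone tail is an approximation to the two-sided p-value, so it is capped at 1.

```python
import numpy as np
from scipy import stats


def ks_two_sided(x, cdf):
    """Return (D, 2 * one-sided p-value, asymptotic p-value).

    Hypothetical helper for illustration only.
    """
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    cdfvals = cdf(x)
    # D+: largest amount by which the empirical CDF exceeds the hypothesized CDF
    d_plus = np.max(np.arange(1, n + 1) / n - cdfvals)
    # D-: largest amount by which the hypothesized CDF exceeds the empirical CDF
    d_minus = np.max(cdfvals - np.arange(0, n) / n)
    d = max(d_plus, d_minus)
    # one-sided finite-sample tail, doubled (assumption: capped at 1)
    p_doubled = min(1.0, 2.0 * stats.ksone.sf(d, n))
    # asymptotic two-sided Kolmogorov distribution
    p_asymptotic = stats.kstwobign.sf(d * np.sqrt(n))
    return d, p_doubled, p_asymptotic


rng = np.random.default_rng(0)
sample = rng.standard_normal(1000)
d, p_doubled, p_asymptotic = ks_two_sided(sample, stats.norm.cdf)
print(d, p_doubled, p_asymptotic)
```

Comparing these three numbers with the output of stats.kstest() on the same sample shows whether D or the p-value (or both) disagree.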
For larger samples, I tried 100 normally distributed random variables:
stats.kstest() still gives the wrong D and p-value, but the difference is
not as large as in small samples.
With a sample of 1000 normal rvs, the D of stats.kstest() and of R are
essentially identical, but the p-value reported by stats.kstest() is
half of the one in R.
>>> xxrl = stats.norm.rvs(size=1000)
>>> resultrl = ksfn(xxrl, 'pnorm', exact=True)  # this is R's ks.test through rpy
So, stats.kstest() definitely needs to be fixed.