[SciPy-Dev] cephes_smirnov never returns on mips/sparc/...

Yaroslav Halchenko lists@onerussian....
Sat Mar 31 10:15:16 CDT 2012


You are probably right, Josef -- especially since I am only distantly familiar
with the KS test -- but let's keep the dialog open a bit longer ;)

> But what's the point in fitting ksone?

For me it was just that it has .fit() ;)    You might recall (I believe I
appeared on the list long ago with similar whining, and that is how we got
introduced to each other) our evil/silly function match_distributions in
PyMVPA, which simply tries to choose the best-matching distribution given the
data -- that is how ksone got involved.

> > if starting values are the most sensible -- then yeap -- them ;)
> > if I ask to 'fit' something, getting some fit is better than getting no
> > fit (as NaNs in output suggest)

> getting the starting values back doesn't mean that you have "some" fit.

> If my brief playing with it today is correct, then the starting values
> don't make sense, for example you have points outside of the support
> of the distribution with estimated parameters (if you have negative
> values in the sample)

> NaN would be better, then at least you know it doesn't make sense.

1. To me the big question became: what ARE the logical values here?

I followed the docstring/example on
http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.ksone.html
-- and got NaNs.

then given that 

In [44]: ksone.a, ksone.b
Out[44]: (0.0, inf)

I still failed to get any sensible fit() for positive values, or even for
a sample drawn from ksone itself, e.g.

ss.ksone.fit(ss.ksone(5).rvs(size=100))

results in a bunch of warnings and then (1.0, nan, nan).

Looking in detail -- rvs is happily generating NaNs (especially for small n's).
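
A quick way to quantify that observation is to count the NaNs in samples drawn
for several n. This is only a sketch: the helper name nan_fraction, the seed
and the list of n values are made up for illustration, and it assumes a SciPy
recent enough for rvs() to accept random_state. The numbers it prints will
depend on the installed SciPy version.

```python
import numpy as np
import scipy.stats as ss


def nan_fraction(n, size=1000, seed=0):
    """Return the fraction of NaNs among `size` draws from ksone(n)."""
    sample = ss.ksone(n).rvs(size=size, random_state=seed)
    return float(np.isnan(sample).mean())


if __name__ == "__main__":
    # Small n first, since that is where the NaNs seemed most frequent
    for n in (1, 2, 5, 10, 50):
        print(n, nan_fraction(n))
```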

1b. Also, the range of sensible values of the parameter n isn't specified
anywhere for KS test newbies like me, which I guess adds to the confusion:

> support of the sample would help. I have no idea about good starting
> values for the shape parameter (n is sample size for kstest)

aha -- so the 'demo' value of 0.9 indeed makes no sense ;)  Might be
worth adjusting somehow?
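
If n really is the sample size, then ksone(n) should describe the one-sided
statistic D_n^+ computed from n points, and that can be checked by simulation.
A minimal sketch, assuming the null distribution is U(0, 1); d_plus and
simulate_d_plus are hypothetical helper names, not SciPy functions.

```python
import numpy as np
import scipy.stats as ss


def d_plus(sample):
    """One-sided KS statistic D_n^+ of `sample` against U(0, 1)."""
    x = np.sort(np.asarray(sample, dtype=float))
    n = len(x)
    # D_n^+ = max_i (i/n - F(x_(i))), with F(x) = x for U(0, 1)
    return float(np.max(np.arange(1, n + 1) / n - x))


def simulate_d_plus(n, reps=2000, seed=0):
    """Draw `reps` uniform samples of size n and return their D_n^+ values."""
    rng = np.random.default_rng(seed)
    return np.array([d_plus(rng.uniform(size=n)) for _ in range(reps)])


if __name__ == "__main__":
    n = 10
    stats = simulate_d_plus(n)
    for x in (0.1, 0.2, 0.3, 0.5):
        # Empirical P(D_n^+ <= x) next to ksone(n).cdf(x)
        print(x, np.mean(stats <= x), ss.ksone(n).cdf(x))
```

Under that reading a non-integer n such as 0.9 has no sample to correspond to,
and every simulated value lies in [0, 1].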

2.

BTW -- trying to familiarize myself with the distribution, I plotted its
pdf, e.g.:

x = np.linspace(0, 3, 1000); plt.plot(x, ksone(10).pdf(x))

and it looks weirdish: http://www.onerussian.com/tmp/ksone-ns.png in that it is
not smooth, and my algebra-forgotten eyes do not see any obvious points where
the cdf given on http://en.wikipedia.org/wiki/Kolmogorov_Smirnov would lack a
2nd derivative.

Also, why is ksone.b inf -- shouldn't it be 1?
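
Both questions in point 2 can be probed numerically. The sketch below compares
ksone(n).pdf with a finite-difference derivative of ksone(n).cdf, then prints
the support endpoints and the cdf at and beyond 1. The helper name
pdf_vs_cdf_slope, the grid and n = 10 are arbitrary choices for illustration,
and the printed values will depend on the installed SciPy version.

```python
import numpy as np
import scipy.stats as ss


def pdf_vs_cdf_slope(n, num=981):
    """Compare ksone(n).pdf with a finite-difference derivative of its cdf."""
    x = np.linspace(0.01, 0.99, num)
    dist = ss.ksone(n)
    pdf = dist.pdf(x)
    slope = np.gradient(dist.cdf(x), x)  # numerical d(cdf)/dx
    return x, pdf, slope


if __name__ == "__main__":
    x, pdf, slope = pdf_vs_cdf_slope(10)
    print("median |pdf - d(cdf)/dx|:", np.median(np.abs(pdf - slope)))
    # Support endpoints as reported by the installed SciPy
    print("a, b:", ss.ksone.a, ss.ksone.b)
    # cdf at 1 and at a point beyond it
    print("cdf(1.0), cdf(1.5):", ss.ksone(10).cdf(1.0), ss.ksone(10).cdf(1.5))
```

If the two curves agree away from a handful of points, the kinks would belong
to the distribution itself rather than to how pdf is computed.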

-- 
=------------------------------------------------------------------=
Keep in touch                                     www.onerussian.com
Yaroslav Halchenko                 www.ohloh.net/accounts/yarikoptic

