[SciPy-dev] Inclusion of Kuiper test in Scipy

Anne Archibald peridot.faceted@gmail....
Mon Nov 2 13:01:18 CST 2009


2009/11/2 Jake VanderPlas <jakevdp@gmail.com>:
> Anne,
> I also recently needed Kuiper test code for my research, and I
> adapted an IDL routine to Python.  I'd say it is definitely worth
> including.  In addition to what you listed, a routine to calculate the
> significance of the Kuiper value would be useful.  I have a Python
> version of that code if you'd like to see it.

Actually, my code has significance calculators for all three tests
(based on Paltani 2004). But I know that the value is somewhat off for
small N - perhaps you could send me yours and I could see if it does
better?

Anne

>   -Jake
>
> On Mon, Nov 2, 2009 at 6:50 AM, Anne Archibald
> <aarchiba@physics.mcgill.ca> wrote:
>> Hi,
>>
>> I have implemented a statistical test from the literature, the Kuiper
>> test, for my own work, but I think it might be worth including it in
>> Scipy itself. I'd like to hear other people's opinions, though, both
>> on what (if anything) should go into scipy, and on whether it needs
>> modification. The code is at:
>>
>> http://github.com/aarchiba/kuiper
>>
>> This code includes a number of things beyond the basic test, some or
>> all of which may not be worth including in Scipy. What's there:
>>
>> The Kuiper test - analogous to the Kolmogorov-Smirnov test, this takes
>> either a sample and a callable CDF or two samples and returns an
>> abstract score and the probability that a score that large would have
>> arisen if the two arguments are from the same distribution. This test
>> is sensitive to somewhat different features of the distribution than
>> the K-S test, and, importantly, it is invariant under cyclic
>> permutation: that is, if all the samples and distribution are modulo
>> (say) 1, then any shift in both arguments leaves the value unaffected.
>> Thus it is well suited to periodic distributions.
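
A minimal sketch of the one-sample Kuiper statistic described above, assuming NumPy and a vectorized CDF callable. The name kuiper_statistic is hypothetical and is not taken from the linked repository; only the score V = D+ + D- is computed here, not its significance.

```python
import numpy as np

def kuiper_statistic(sample, cdf):
    """Kuiper score V = D+ + D- of a sample against a callable CDF.

    Hypothetical helper for illustration; the significance
    calculation is deliberately left out.
    """
    x = np.sort(np.asarray(sample, dtype=float))
    n = x.size
    f = cdf(x)
    # Largest excess of the empirical CDF above the model CDF.
    d_plus = np.max(np.arange(1, n + 1) / n - f)
    # Largest excess of the model CDF above the empirical CDF.
    d_minus = np.max(f - np.arange(n) / n)
    return d_plus + d_minus

# Cyclic-shift invariance for phases modulo 1 against a uniform CDF.
rng = np.random.default_rng(0)
phases = rng.random(50)
shifted = (phases + 0.3) % 1.0
v_original = kuiper_statistic(phases, lambda t: t)
v_shifted = kuiper_statistic(shifted, lambda t: t)
```

The two values agree up to rounding, which is the shift invariance that the K-S statistic lacks.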
>>
>> The Z_m^2 test - a test for uniformity on [0,1) based on the first m
>> Fourier coefficients. Returns a score and the probability of a score
>> that large.
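
A minimal sketch of the Z_m^2 score, assuming NumPy and phases already reduced to [0,1). The name z_m_squared is hypothetical. The probability uses the large-sample result that Z_m^2 follows a chi-squared distribution with 2m degrees of freedom under uniformity, which may differ from what the linked code does for small N.

```python
import math
import numpy as np

def z_m_squared(phases, m=2):
    """Z_m^2 score and asymptotic tail probability (hypothetical helper)."""
    phases = np.asarray(phases, dtype=float)
    n = phases.size
    k = np.arange(1, m + 1)[:, None]          # harmonics 1..m
    angles = 2 * np.pi * k * phases[None, :]  # shape (m, n)
    cos_sums = np.cos(angles).sum(axis=1)
    sin_sums = np.sin(angles).sum(axis=1)
    z = float((2.0 / n) * np.sum(cos_sums ** 2 + sin_sums ** 2))
    # Chi-squared survival function for an even number (2m) of degrees
    # of freedom has a closed form, so SciPy is not needed here.
    half = z / 2.0
    p = math.exp(-half) * sum(half ** j / math.factorial(j) for j in range(m))
    return z, p

# Four identical phases give the strongest possible first harmonic.
z, p = z_m_squared([0.25, 0.25, 0.25, 0.25], m=1)
```

With SciPy available, the closed-form sum should match scipy.stats.chi2.sf(z, 2 * m).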
>>
>> The H test - a test that uses a data-dependent number of harmonics to
>> test for uniformity. Returns the score and the probability, and also
>> the number of harmonics that gave the most significant detection.
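
A minimal sketch of the H score, assuming the usual definition H = max over m of (Z_m^2 - 4m + 4) with m running from 1 to 20. The name h_statistic and the max_harmonics parameter are hypothetical, and the significance calculation is omitted because the sketch does not attempt to reproduce the calibration used in the linked code.

```python
import numpy as np

def h_statistic(phases, max_harmonics=20):
    """H score and the number of harmonics that maximizes it.

    Hypothetical helper for illustration; no probability is returned.
    """
    phases = np.asarray(phases, dtype=float)
    n = phases.size
    k = np.arange(1, max_harmonics + 1)[:, None]
    angles = 2 * np.pi * k * phases[None, :]
    # Power in each harmonic; the cumulative sum gives Z_m^2 for m = 1..max.
    power = (2.0 / n) * (np.cos(angles).sum(axis=1) ** 2
                         + np.sin(angles).sum(axis=1) ** 2)
    z = np.cumsum(power)
    m = np.arange(1, max_harmonics + 1)
    scores = z - 4 * m + 4   # penalty for each extra harmonic
    best = int(np.argmax(scores))
    return float(scores[best]), best + 1

h, n_harmonics = h_statistic([0.25, 0.25, 0.25, 0.25])
```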
>>
>> fold_intervals - a function to take a series of weighted intervals and
>> return the total exposure of each phase modulo 1. For testing for
>> uniformity when you have more data from some phases than others.
>>
>> cdf_from_intervals - a function to construct a piecewise-linear CDF
>> from a set of exposures (as returned by the above function).
>>
>> histogram_intervals - a function to evaluate how much exposure each
>> histogram bin received, to allow testing for uniformity using a
>> histogram in the presence of non-uniform exposure.
>>
>> There are also a couple of handy decorators in the test suite:
>>
>> seed - set the random seed before running a test
>> double_check - for randomized tests: run once, and if it fails, run it again.
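
A minimal sketch of what two such decorators could look like, assuming NumPy's global random state and tests that signal failure with AssertionError. The signatures are assumptions for illustration and are not copied from the linked test suite.

```python
import functools
import numpy as np

def seed(n=0):
    """Set the global NumPy random seed before each call of the test."""
    def decorator(test):
        @functools.wraps(test)
        def wrapper(*args, **kwargs):
            np.random.seed(n)
            return test(*args, **kwargs)
        return wrapper
    return decorator

def double_check(test):
    """Run a randomized test once; if it fails, run it one more time."""
    @functools.wraps(test)
    def wrapper(*args, **kwargs):
        try:
            return test(*args, **kwargs)
        except AssertionError:
            # A second failure propagates to the test runner as usual.
            return test(*args, **kwargs)
    return wrapper

@seed(123)
def draw():
    return np.random.rand()

calls = []

@double_check
def flaky():
    calls.append(1)
    assert len(calls) > 1  # fails on the first call only

flaky()
```

Retrying once squares the false-alarm rate of a randomized test, at the cost of also squaring the chance that a genuine intermittent bug slips through.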
>>
>> All have tests and somewhat informative docstrings, but I suspect some
>> of them may be too specialized to be of much use. The Kuiper test
>> should have wide applicability; the Z_m^2 test and H test, not so
>> much, although they are handy when testing for periodicity. The last
>> batch of utility functions I'm not sure are general enough to be very
>> useful, but I needed them.
>>
>> What do you think? How much of this would be useful in Scipy?
>>
>> Thanks,
>> Anne
>> _______________________________________________
>> Scipy-dev mailing list
>> Scipy-dev@scipy.org
>> http://mail.scipy.org/mailman/listinfo/scipy-dev
>>

