# [SciPy-User] [SciPy-user] Unit testing of Bayesian estimator

Anne Archibald peridot.faceted@gmail....
Sun Nov 8 01:25:04 CST 2009

```2009/11/7 Tom K. <tpk@kraussfamily.org>:
>
> Hi Anne, interesting question.
>
> I'm not really sure what a Bayesian hypothesis tester is, but I expect that
> it results in a random variable.  For a given input prior and measurement
> distribution, and choice of hypothesis (signal present or signal absent),
> can you know the distribution of this random variable?  If so, it could come
> down to a test that this random variable - or a function of it such as mean
> or probability that it is greater than some value - behaves as expected.
> How do you create a unit-test that a random variable generator is working?
> If the random variables were all iid normal, you could average a bunch and
> then test that the mean of the sample was close to the mean of the
> distribution - which is going to be impossible to guarantee, since there is
> a non-zero probability that the mean is large.  In practice however it is
> likely that your test will pass, but you will probably have to tack down the
> seeds and make sure that the probability of failure is really small so
> changing seeds (and hence the underlying sequence of "random" inputs) won't
> likely cause a failure.
>
> Is there anything in scipy's stats module to test that a series of random
> variables has a given distribution?  Maybe scipy.stats.kstest?  Who the heck
> are Kolmogorov and Smirnov anyway :-)?

Focusing on the hypothesis testing part, what my code does is take a
collection of photons and return the probability that they are drawn
from a pulsed distribution. My prior has two alternatives: not pulsed
(p=0.5) and pulsed (p=0.5, parameters randomly chosen). I feed this to
a Bayesian gizmo and get back a probability that the photons were
drawn from the former case.

In terms of testing, the very crude tests pass: if I give it a zillion
photons, it can correctly distinguish pulsed from unpulsed. But what
I'd like to test is whether the probability it returns is correct.
What I'd really like is some statistical test I can do on the
procedure to check whether the returned numbers are correct. Of
course, if I knew what distribution they were supposed to have, I
could just feed them to the K-S test. But I don't.

Part of the problem is that the data quality affects the result: if
feed in a zillion unpulsed photons, I get a variety of probabilities
that are close to zero - but how close is correct? I have no idea. If
I use pulsed photons, it is even more complicated: for a large pulsed
fraction, I'll get a variety of probabilities that are close to one.
But if I either reduce the number of photons or the pulsed fraction,
it gets harder to distinguish pulsed from unpulsed and the
probabilities start to drop. But I have no real idea how much.

Anne

> Anne Archibald-2 wrote:
>>
>> Hi,
>>
>> I have implemented a simple Bayesian regression program (it takes
>> events modulo one and returns a posterior probability that the data is
>> phase-invariant plus a posterior distribution for two parameters
>> (modulation fraction and phase) in case there is modulation). I'm
>> rather new at this, so I'd like to construct some unit tests. Does
>> anyone have any suggestions on how to go about this?
>>
>> For a frequentist periodicity detector, the return value is a
>> probability that, given the null hypothesis is true, the statistic
>> would be this extreme. So I can construct a strong unit test by
>> generating a collection of data sets given the null hypothesis,
>> evaluating the statistic, and seeing whether the number that claim to
>> be significant at a 5% level is really 5%. (In fact I can use the
>> binomial distribution to get limits on the number of false positive.)
>> This gives me a unit test that is completely orthogonal to my
>> implementation, and that passes if and only if the code works. For a
>> Bayesian hypothesis testing setup, I don't really see how to do
>> something analogous.
>>
>> I can generate non-modulated data sets and confirm that my code
>> returns a high probability that the data is not modulated, but how
>> high should I expect the probability to be? I can generate data sets
>> with models with known parameters and check that the best-fit
>> parameters are close to the known parameters - but how close? Even if
>> I do it many times, is the posterior mean unbiased? What about the
>> posterior mode or median? I can even generate models and then data
>> sets that are drawn from the prior distribution, but what should I
>> expect from the code output on such a data set? I feel sure there's
>> some test that verifies a statistical property of Bayesian
>> estimators/hypothesis testers, but I cant quite put my finger on it.
>>
>> Suggestions welcome.
>>
>> Thanks,
>> Anne
>> _______________________________________________
>> SciPy-User mailing list
>> SciPy-User@scipy.org
>> http://mail.scipy.org/mailman/listinfo/scipy-user
>>
>>
>
> --
> View this message in context: http://old.nabble.com/Unit-testing-of-Bayesian-estimator-tp26240654p26241135.html
> Sent from the Scipy-User mailing list archive at Nabble.com.
>
> _______________________________________________
> SciPy-User mailing list
> SciPy-User@scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user
>
```