[SciPy-User] Unit testing of Bayesian estimator

Anne Archibald aarchiba@physics.mcgill...
Fri Nov 6 18:13:20 CST 2009


I have implemented a simple Bayesian regression program (it takes
events modulo one and returns a posterior probability that the data is
phase-invariant plus a posterior distribution for two parameters
(modulation fraction and phase) in case there is modulation). I'm
rather new at this, so I'd like to construct some unit tests. Does
anyone have any suggestions on how to go about this?

For a frequentist periodicity detector, the return value is a
probability that, given the null hypothesis is true, the statistic
would be this extreme. So I can construct a strong unit test by
generating a collection of data sets given the null hypothesis,
evaluating the statistic, and seeing whether the number that claim to
be significant at a 5% level is really 5%. (In fact I can use the
binomial distribution to get limits on the number of false positive.)
This gives me a unit test that is completely orthogonal to my
implementation, and that passes if and only if the code works. For a
Bayesian hypothesis testing setup, I don't really see how to do
something analogous.

I can generate non-modulated data sets and confirm that my code
returns a high probability that the data is not modulated, but how
high should I expect the probability to be? I can generate data sets
with models with known parameters and check that the best-fit
parameters are close to the known parameters - but how close? Even if
I do it many times, is the posterior mean unbiased? What about the
posterior mode or median? I can even generate models and then data
sets that are drawn from the prior distribution, but what should I
expect from the code output on such a data set? I feel sure there's
some test that verifies a statistical property of Bayesian
estimators/hypothesis testers, but I cant quite put my finger on it.

Suggestions welcome.


More information about the SciPy-User mailing list