# [SciPy-User] Unit testing of Bayesian estimator

josef.pktd@gmai... josef.pktd@gmai...
Fri Nov 6 21:37:44 CST 2009

```On Fri, Nov 6, 2009 at 7:13 PM, Anne Archibald
<aarchiba@physics.mcgill.ca> wrote:
> Hi,
>
> I have implemented a simple Bayesian regression program (it takes
> events modulo one and returns a posterior probability that the data is
> phase-invariant plus a posterior distribution for two parameters
> (modulation fraction and phase) in case there is modulation). I'm
> rather new at this, so I'd like to construct some unit tests. Does
> anyone have suggestions?
>
> For a frequentist periodicity detector, the return value is a
> probability that, given the null hypothesis is true, the statistic
> would be this extreme. So I can construct a strong unit test by
> generating a collection of data sets given the null hypothesis,
> evaluating the statistic, and seeing whether the number that claim to
> be significant at a 5% level is really 5%. (In fact I can use the
> binomial distribution to get limits on the number of false positives.)
> This gives me a unit test that is completely orthogonal to my
> implementation, and that passes if and only if the code works. For a
> Bayesian hypothesis testing setup, I don't really see how to do
> something analogous.
>
> I can generate non-modulated data sets and confirm that my code
> returns a high probability that the data is not modulated, but how
> high should I expect the probability to be? I can generate data sets
> with models with known parameters and check that the best-fit
> parameters are close to the known parameters - but how close? Even if
> I do it many times, is the posterior mean unbiased? What about the
> posterior mode or median? I can even generate models and then data
> sets that are drawn from the prior distribution, but what should I
> expect from the code output on such a data set? I feel sure there's
> some test that verifies a statistical property of Bayesian
> estimators/hypothesis testers, but I can't quite put my finger on it.

The Bayesian experts are at pymc; maybe you can look at their tests
for inspiration. I don't know them myself, since I never looked at
that part of the code.

I have never tried to test a Bayesian estimator, but many properties
are still the same as in non-Bayesian analysis. In my Bayesian past, I
essentially only used the normal, t, and binomial distributions.

One of my first tests for these things is to create a huge sample and
see whether the parameter estimates converge. With Bayesian analysis
you still have the law of large numbers (for non-dogmatic priors).
Do you have an example with a known posterior? Then the posterior
with a large sample, or the average in a Monte Carlo, should still be
approximately the true one.

For symmetric distributions, the Bayesian posterior confidence
intervals and posterior mean should be roughly the same as the
frequentist estimates. With diffuse priors, in many cases the results
of the Bayesian analysis and MLE are exactly the same.

Another check I used in the past is to trace the posterior mean as
the prior variance is varied: in one extreme you get the prior back,
in the other extreme the MLE.

> I can even generate models and then data
> sets that are drawn from the prior distribution, but what should I
> expect from the code output on such a data set?

If you ignore the Bayesian interpretation, then this is just a
standard sampling problem: you draw prior parameters and observations,
and the rest is finding the conditional and marginal probabilities. I
think the posterior odds ratio should converge in a large Monte Carlo
to the true one, and the rejection frequencies should correspond to
the significance level that was set for the test (5%).
(In the simplest case of conjugate priors, you can just interpret the
prior as a previous sample, and you are back to a frequentist
explanation.)

The problem is that with an informative prior, you always have a
biased estimator in small samples, and the posterior odds ratio is
also affected by the informative prior. And "real" Bayesians don't
care about frequentist bias.

What are your prior distributions and the likelihood function in your
case? Can you model degenerate and diffuse priors, so that an
informative prior doesn't influence your sampling results?
I'm trying to think of special cases where you could remove the effect
of the prior.

It's a bit vague because I don't see the details, and I haven't looked
at this in a while.

>
> Suggestions welcome.
>
> Thanks,
> Anne
> _______________________________________________
> SciPy-User mailing list
> SciPy-User@scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user
>
```
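Anne's frequentist calibration test (generate null data sets, count how many claim significance at 5%, and bound the count with the binomial distribution) can be sketched directly. This is a hypothetical example, not her actual detector: it uses the Rayleigh test for uniformity of phases as a stand-in statistic, and a 4-standard-deviation binomial band around the expected 5% false-positive count.

```python
# Calibration check for a frequentist periodicity test under the null.
# `rayleigh_test` is a stand-in for whatever detector is actually in use;
# any function returning a p-value works the same way.
import numpy as np

rng = np.random.default_rng(0)

def rayleigh_test(phases):
    """Approximate p-value of the Rayleigh test for uniform phases in [0, 1)."""
    n = len(phases)
    angles = 2 * np.pi * phases
    R2 = np.sum(np.cos(angles)) ** 2 + np.sum(np.sin(angles)) ** 2
    z = R2 / n
    return np.exp(-z)  # large-n approximation; fine for n ~ 100

n_sets, n_events, alpha = 1000, 100, 0.05
false_pos = sum(
    rayleigh_test(rng.random(n_events)) < alpha for _ in range(n_sets)
)
# Under the null, false_pos ~ Binomial(1000, 0.05): mean 50, sd ~ 6.9.
# A generous 4-sd band is roughly [22, 78].
assert 22 <= false_pos <= 78
```

The same harness tests any detector that emits p-values; only `rayleigh_test` needs replacing.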
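The suggestion above to trace the posterior mean as the prior variance is varied can be checked exactly in a conjugate model. A minimal sketch with a Normal likelihood (known variance 1) and a Normal prior on the mean — a toy model chosen for its closed-form posterior, not Anne's actual likelihood:

```python
# Normal-Normal conjugate model: prior N(mu0, tau2) on the mean, data N(mu, 1).
# The posterior mean is a precision-weighted average of prior mean and sample
# mean: as tau2 -> 0 it collapses to the prior, as tau2 -> inf it becomes the MLE.
import numpy as np

rng = np.random.default_rng(1)
mu_true, n = 2.0, 50
x = rng.normal(mu_true, 1.0, size=n)
xbar = x.mean()          # the MLE for the mean
mu0 = -5.0               # deliberately wrong prior mean

def posterior_mean(tau2):
    # closed-form precision-weighted combination
    return (mu0 / tau2 + n * xbar) / (1.0 / tau2 + n)

assert abs(posterior_mean(1e-8) - mu0) < 1e-3   # dogmatic prior: prior wins
assert abs(posterior_mean(1e8) - xbar) < 1e-3   # diffuse prior: MLE recovered
```

The two asserts are exactly the two extremes described in the email; intermediate `tau2` values trace a monotone path between them.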
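Drawing parameters from the prior and then data from the likelihood, as discussed above, leads to a concrete frequency check on a Bayesian procedure: posterior credible intervals must cover the drawn parameter at their nominal rate. A sketch using a Beta-Binomial conjugate model (again a toy model standing in for the real likelihood):

```python
# Calibration of credible intervals when (theta, data) are drawn from the
# joint model: a 95% equal-tailed interval must cover theta ~95% of the time.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
a, b, n_trials, n_sims = 2.0, 3.0, 30, 2000
covered = 0
for _ in range(n_sims):
    theta = rng.beta(a, b)                      # parameter from the prior
    k = rng.binomial(n_trials, theta)           # data from the likelihood
    post = stats.beta(a + k, b + n_trials - k)  # exact conjugate posterior
    lo, hi = post.ppf(0.025), post.ppf(0.975)
    covered += lo <= theta <= hi
rate = covered / n_sims
# Binomial(2000, 0.95) has sd ~ 0.005, so [0.93, 0.97] is a generous band.
assert 0.93 <= rate <= 0.97
```

This test is implementation-orthogonal in the same sense as the frequentist one: it passes for a correctly computed posterior and fails if the conditional probabilities are wrong.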