[SciPy-User] issue estimating the multivariate Polya distribution: why?

josef.pktd@gmai... josef.pktd@gmai...
Sat Mar 3 17:24:32 CST 2012

```On Sat, Mar 3, 2012 at 4:05 PM, Emanuele Olivetti
<emanuele@relativita.com> wrote:
> Dear All,
>
> I am playing with the multivariate Polya distribution - also
> known as the Dirichlet compound multinomial distribution. A
> brief description from wikipedia:
>   http://en.wikipedia.org/wiki/Multivariate_P%C3%B3lya_distribution
>
> I made a straightforward implementation of the probability
> density function in the log-scale here:
>   https://gist.github.com/1968113
> together with a straightforward montecarlo estimation (by
> sampling first from a Dirichlet and then computing the log-likelihood
> of the multinomial) in the log-scale as well. The log-scale was
> chosen in order to improve numerical stability.
>
> If you run the code liked above you should get these two examples:
> ----
> X: [ 0 50 50]
> alpha: [ 1 10 10]
> analytic: -5.22892710577
> montecarlo -5.23470053651
>
> X: [100   0   0]
> alpha: [ 1 10 10]
> analytic: -51.737395965
> montecarlo -93.5266543113
> ----
>
> As you can see in the first case, i.e. X=[0,50,50], there is excellent
> agreement between the two implementations while in the second case, i.e.
> x=[100,0,0], there is a dramatic disagreement. Note that the montecarlo
> estimate is quite stable and if you change the seed of the random
> number generator you get numbers not too far from -90.
>
> So my question is: where does this issue come from? I cannot see
> mistakes in the implementation (the code is very simple) and
> I cannot see the source of numerical instability.

I don't see anything. I was trying out several different parameters,
and my only guess is that
the logaddexp is not precise enough in this case. My results (numpy
1.5.1) are even worse.

The probability that you want to calculate is very low

>>> np.exp(-51.737395965)
3.3941765165211696e-23

For larger values it seems to work fine, but it deteriorates fast when
the loglikelihood drops below -15 or so (with the versions I have
installed).

Do you need almost zero probability events?

In a similar problem with poisson mixtures but without monte carlo, I
was trying out various ways of rescaling, but didn't come up with
anything useful.

Josef

>
> Any hint?
>
> Best,
>
> Emanuele
>
> _______________________________________________
> SciPy-User mailing list
> SciPy-User@scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user
```