[Numpy-discussion] Multivariate hypergeometric distribution?

josef.pktd@gmai... josef.pktd@gmai...
Mon Jul 2 19:08:47 CDT 2012

On Mon, Jul 2, 2012 at 4:16 PM, Fernando Perez <fperez.net@gmail.com> wrote:
> Hi all,
> in recent work with a colleague, the need came up for a multivariate
> hypergeometric sampler; I had a look in the numpy code and saw we have
> the bivariate version, but not the multivariate one.
> I had a look at the code in scipy.stats.distributions, and it doesn't
> look too difficult to add a proper multivariate hypergeometric by
> extending the bivariate code, with one important caveat: the hard part
> is the implementation of the actual discrete hypergeometric sampler,
> which lives inside of numpy/random/mtrand/distributions.c:
> https://github.com/numpy/numpy/blob/master/numpy/random/mtrand/distributions.c#L743
> That code is hand-written C, and it only works for the bivariate case
> right now.  It doesn't look terribly difficult to extend, but it will
> certainly take a bit of care and testing to ensure all edge cases are
> handled correctly.

My only foray into this


This looks difficult to add without a good reference and clear
description of the algorithm.

> Does anyone happen to have that implemented lying around, in a form
> that would be easy to merge to add this capability to numpy?

Not me; I have never even heard of the multivariate hypergeometric distribution.

Maybe http://hal.inria.fr/docs/00/11/00/56/PDF/perm.pdf (p. 11),
with some properties at http://www.math.uah.edu/stat/urn/MultiHypergeometric.html
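
For what it's worth, one standard property of the distribution is that
the count for any single color is itself an ordinary (two-color)
hypergeometric, with all the other colors lumped together as "bad".
That suggests chaining calls to the existing sampler. A minimal sketch,
assuming that approach; the function name and argument names are made up
for illustration, this is not code from numpy, and it is not necessarily
the algorithm in the references above:

```python
import numpy as np

def multivariate_hypergeom_chain(colors, nsample, rng=None):
    """Hypothetical helper: draw counts per color, without replacement.

    colors  : number of items of each color in the urn
    nsample : number of items drawn
    rng     : object with a hypergeometric(ngood, nbad, nsample) method
              (defaults to the np.random module)
    """
    rng = np.random if rng is None else rng
    colors = [int(c) for c in colors]
    remaining_items = sum(colors)
    if nsample < 0 or nsample > remaining_items:
        raise ValueError("nsample must be between 0 and sum(colors)")

    counts = [0] * len(colors)
    remaining_draws = nsample
    for i, c in enumerate(colors):
        if remaining_draws == 0:
            break  # nothing left to draw; the rest stay at zero
        others = remaining_items - c  # items of all later colors
        if remaining_draws == remaining_items:
            take = c  # every remaining item must be drawn
        elif others == 0:
            take = remaining_draws  # only this color is left
        elif c == 0:
            take = 0  # no items of this color in the urn
        else:
            # color i is "good", all later colors together are "bad"
            take = int(rng.hypergeometric(c, others, remaining_draws))
        counts[i] = take
        remaining_draws -= take
        remaining_items -= c
    return counts
```

This needs at most one univariate hypergeometric draw per color, so its
cost depends on the number of colors rather than the number of items
drawn. The edge cases (empty colors, drawing everything) are handled
outside the sampler on purpose, since those are exactly the spots where
I would not want to rely on its behavior.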

I've seen one other algorithm that seems to need N random variates
(N = number of draws in the hypergeometric) for a single multivariate
hypergeometric draw, which seems slow to me.
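
To make the cost concrete, here is a minimal sketch of the obvious
one-variate-per-draw scheme: simulate the urn directly, removing one
item at a time. This is only my reading of what such an algorithm would
look like, not a transcription of any particular reference, and the
function name is hypothetical:

```python
import random

def multivariate_hypergeom_naive(colors, nsample, rng=None):
    """Hypothetical helper: simulate drawing nsample items one by one."""
    rng = rng or random.Random()
    remaining = [int(c) for c in colors]
    total = sum(remaining)
    if nsample < 0 or nsample > total:
        raise ValueError("nsample must be between 0 and sum(colors)")

    counts = [0] * len(remaining)
    for _ in range(nsample):
        # one uniform variate per draw: index of an item still in the urn
        u = rng.randrange(total)
        for i, r in enumerate(remaining):
            if u < r:
                # the item falls in color i's block; remove it from the urn
                counts[i] += 1
                remaining[i] -= 1
                total -= 1
                break
            u -= r
    return counts
```

The loop runs nsample times, each with a scan over the colors, so the
work grows with the number of items drawn. That is the slowness I mean.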

But maybe someone has it lying around.


> Thanks,
> f
