[Numpy-discussion] Multivariate hypergeometric distribution?

Fernando Perez fperez.net@gmail....
Mon Jul 2 15:16:26 CDT 2012

Hi all,

in recent work with a colleague, we needed a multivariate
hypergeometric sampler; I looked at the numpy code and saw that we have
the bivariate (two-category) version, but not the multivariate one.

I also looked at the code in scipy.stats.distributions, and it doesn't
look too difficult to add a proper multivariate hypergeometric by
extending the bivariate code, with one important caveat: the hard part
is the implementation of the actual discrete hypergeometric sampler,
which lives in numpy/random/mtrand/distributions.c.

That code is hand-written C, and it only works for the bivariate case
right now.  It doesn't look terribly difficult to extend, but it will
certainly take a bit of care and testing to ensure all edge cases are
handled correctly.

Does anyone happen to have that implemented lying around, in a form
that would be easy to merge to add this capability to numpy?
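
For reference, here is a minimal pure-Python sketch of one possible
approach: draw the count for each category in turn with the existing
two-category sampler, conditioning on what has already been drawn. This
is only an illustration of the idea, not the C-level extension discussed
above. The function name and its parameters (colors, nsample, rng) are
made up for this example; the only numpy call it relies on is
hypergeometric(ngood, nbad, nsample).

```python
import numpy as np

def multivariate_hypergeometric_sketch(colors, nsample, rng=None):
    """Draw one multivariate hypergeometric sample (hypothetical helper).

    colors  : number of items available in each category
    nsample : number of items drawn without replacement
    rng     : object with a .hypergeometric method (defaults to np.random)
    """
    rng = np.random if rng is None else rng
    colors = np.asarray(colors, dtype=np.int64)
    if colors.ndim != 1 or colors.size == 0 or np.any(colors < 0):
        raise ValueError("colors must be a non-empty 1-d array of counts >= 0")
    total = int(colors.sum())
    if not 0 <= nsample <= total:
        raise ValueError("nsample must satisfy 0 <= nsample <= sum(colors)")

    counts = np.zeros_like(colors)
    draws_left = int(nsample)   # draws not yet assigned to a category
    others_left = total         # items in the categories not yet visited

    for i, n_i in enumerate(colors[:-1]):
        if draws_left == 0:
            break
        n_i = int(n_i)
        others_left -= n_i      # now counts only the later categories
        if n_i == 0:
            continue            # nothing to draw from this category
        if others_left == 0:
            k = draws_left      # every remaining draw must come from here
        else:
            # Two-category draw: category i versus all later categories.
            k = int(rng.hypergeometric(n_i, others_left, draws_left))
        counts[i] = k
        draws_left -= k

    # Whatever is left necessarily comes from the last category.
    counts[-1] = draws_left
    return counts
```

Calling multivariate_hypergeometric_sketch([5, 3, 7], 6) returns an
array of three counts that sum to 6, with each count no larger than the
corresponding entry of colors. The explicit checks for empty categories
and exhausted draws are there so that the two-category sampler is never
called with zero-valued arguments, which is one of the edge cases a
C version would also have to handle.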



