[Numpy-discussion] Multivariate hypergeometric distribution?
Mon Jul 2 19:08:47 CDT 2012
On Mon, Jul 2, 2012 at 4:16 PM, Fernando Perez <email@example.com> wrote:
> Hi all,
> in recent work with a colleague, the need came up for a multivariate
> hypergeometric sampler; I had a look in the numpy code and saw we have
> the bivariate version, but not the multivariate one.
> I had a look at the code in scipy.stats.distributions, and it doesn't
> look too difficult to add a proper multivariate hypergeometric by
> extending the bivariate code, with one important caveat: the hard part
> is the implementation of the actual discrete hypergeometric sampler,
> which lives inside of numpy/random/mtrand/distributions.c:
> That code is hand-written C, and it only works for the bivariate case
> right now. It doesn't look terribly difficult to extend, but it will
> certainly take a bit of care and testing to ensure all edge cases are
> handled correctly.
My only foray into this
This looks difficult to add without a good reference and clear
description of the algorithm.
> Does anyone happen to have that implemented lying around, in a form
> that would be easy to merge to add this capability to numpy?
Not me, I had never even heard of the multivariate hypergeometric distribution.
Maybe see http://hal.inria.fr/docs/00/11/00/56/PDF/perm.pdf (p. 11),
with some properties at http://www.math.uah.edu/stat/urn/MultiHypergeometric.html
I've seen one other algorithm, which seems to need N random variates
(N being the number of draws in the hypergeometric) for a single
multivariate hypergeometric draw, and that seems slow to me.
But maybe someone has it lying around.
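
For comparison with the N-variates approach, here is a rough sketch of a
different technique: chaining univariate hypergeometric draws, one per
color, so that a single multivariate draw needs at most (number of
colors - 1) variates. It is not the algorithm from either link above and
not anything that exists in numpy/random/mtrand/distributions.c. The
function name and its arguments are hypothetical, and it assumes a NumPy
recent enough to provide np.random.default_rng.

```python
import numpy as np


def multivariate_hypergeometric(colors, nsample, rng=None):
    """Draw one multivariate hypergeometric sample (hypothetical helper).

    colors[i] is the number of items of color i in the urn; nsample items
    are drawn without replacement. Returns the count drawn of each color.
    """
    rng = np.random.default_rng() if rng is None else rng
    colors = np.asarray(colors, dtype=np.int64)
    total = int(colors.sum())
    if nsample < 0 or nsample > total:
        raise ValueError("nsample must be between 0 and sum(colors)")

    counts = np.zeros(len(colors), dtype=np.int64)
    remaining_items = total    # items of the colors not yet handled
    remaining_draws = nsample  # draws not yet assigned to a color

    for i, n_i in enumerate(colors[:-1]):
        if remaining_draws == 0:
            break
        n_i = int(n_i)
        if n_i == 0:
            continue
        # After this subtraction, remaining_items counts only later colors.
        remaining_items -= n_i
        # Conditional on the earlier colors, the count for color i is a
        # univariate hypergeometric: n_i "good" vs. all later colors "bad".
        k = int(rng.hypergeometric(n_i, remaining_items, remaining_draws))
        counts[i] = k
        remaining_draws -= k

    if counts.size:
        # Whatever is left over must come from the last color.
        counts[-1] = remaining_draws
    return counts
```

Something like multivariate_hypergeometric([5, 3, 0, 7], 6) then returns
four counts that sum to 6, each bounded by the matching entry of colors.
Whether this is fast enough, or well behaved for very large urns, depends
entirely on the underlying univariate sampler, so treat it as a starting
point for testing rather than a reference implementation.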