[Numpy-discussion] categorical distributions
josef.pktd@gmai...
Mon Nov 22 07:46:10 CST 2010
On Mon, Nov 22, 2010 at 6:05 AM, Hagen Fürstenau <hagen@zhuliguan.net> wrote:
>> ISTM that this elementary functionality deserves an implementation
>> that's as fast as it can be.
>
> To substantiate this, I just wrote a simple implementation of
> "categorical" in "numpy/random/mtrand.pyx" and it's more than 8x faster
> than your version for multiple samples of the same distribution and more
> than 3x faster than using "multinomial(1, ...)" for multiple samples of
> different distributions (each time tested with 1000 samples drawn from
> distributions over 1000 categories).
>
> I can provide it as a patch if there's any interest.
Can you compare the speed of your Cython solution with Chuck's version?
--
For instance, weight 0..3 by 1..4, then
In [14]: w = arange(1,5)
In [15]: p = cumsum(w)/float(w.sum())
In [16]: bincount(p.searchsorted(random(1000000)))/1e6
Out[16]: array([ 0.100336, 0.200382, 0.299132, 0.40015 ])
-------------
from numpy mailing list thread "Weighted random integers", sep. 10
Using searchsorted gave roughly a 10x speedup over my
multinomial version.
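
For reference, here is a minimal self-contained sketch of the searchsorted approach from the session above. The function name `categorical` and its signature are hypothetical (this is not an existing NumPy function), and the sketch assumes the newer `numpy.random.default_rng` Generator API rather than the `random` call used in the original session. It normalizes the cumulative weights into a CDF and looks up uniform draws in it.

```python
import numpy as np

def categorical(weights, size, rng=None):
    """Draw `size` category indices with probability proportional to `weights`.

    Hypothetical helper for illustration; not part of NumPy.
    """
    rng = np.random.default_rng() if rng is None else rng
    weights = np.asarray(weights, dtype=float)
    # Cumulative distribution function; the last entry is exactly 1.0.
    cdf = np.cumsum(weights)
    cdf /= cdf[-1]
    # Uniform draws lie in [0, 1). side="right" returns the first index i
    # with u < cdf[i], so zero-weight categories are never selected.
    return np.searchsorted(cdf, rng.random(size), side="right")

# Same setup as the session above: weight categories 0..3 by 1..4.
w = np.arange(1, 5)
samples = categorical(w, 1_000_000, rng=np.random.default_rng(0))
freq = np.bincount(samples, minlength=len(w)) / samples.size
print(freq)  # should be close to [0.1, 0.2, 0.3, 0.4]
```

The CDF is built once and each sample costs one binary search, which is probably why this does well for many samples from the same distribution. It does not help directly when every sample comes from a different distribution.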
Josef
>
> - Hagen
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>