[SciPy-User] vectorized version of 'multinomial' sampling function
josef.pktd@gmai...
josef.pktd@gmai...
Wed Oct 14 00:41:22 CDT 2009
On Wed, Oct 14, 2009 at 1:07 AM, <josef.pktd@gmail.com> wrote:
> On Tue, Oct 13, 2009 at 5:01 PM, per freem <perfreem@gmail.com> wrote:
>> hi all,
>>
>> i have a series of probability vector that i'd like to feed into
>> multinomial to get an array of vector outcomes back. for example,
>> given:
>>
>> p = array([[ 0.9 , 0.05, 0.05],
>> [ 0.05, 0.05, 0.9 ]])
>>
>> i'd like to call multinomial like this:
>>
>> multinomial(1, p)
>
> If you only want n=1, then it seems to me, you could do this also with
> drawing an integer out of range(3) with the given probabilities. This
> can be vectorized easily directly with numpy. (just like an individual
> observation in multinomial logit)
>
> Josef
>
>>
>> to get a vector of multinomial samplers, each using the nth list in
>> 'p'. something like:
>>
>> array([[1, 0, 0], [0, 0 1]]) in this case. is this possible? it seems
>> like 'multinomial' takes only a one dimensional array. i could write
>> this as a "for" loop of course but i prefer a vectorized version since
>> speed is crucial for me here.
for n=1 the following should work (unless I still cannot tell
multinomial distribution and multinomial logit apart)
Josef
>>> p = np.array([[ 0.9 , 0.05, 0.05],
[ 0.05, 0.05, 0.9 ]])
>>> pcum = p.cumsum(1)
>>> pcuma = np.repeat(pcum,20, 0)
>>> rvs = np.random.uniform(size=(40))
>>> a,b,c = pcuma.T
>>> mnrvs = np.column_stack((rvs<=a, (rvs>a) & (rvs<=b), rvs>b)).astype(int)
>>> mnrvs[:20].sum()
20
>>> mnrvs[:20].sum(0)
array([19, 0, 1])
>>> mnrvs[20:].sum()
20
>>> mnrvs[20:].sum(0)
array([ 1, 0, 19])
>>> mnrvs[15:25]
array([[0, 0, 1],
[1, 0, 0],
[1, 0, 0],
[1, 0, 0],
[1, 0, 0],
[0, 0, 1],
[0, 0, 1],
[0, 0, 1],
[1, 0, 0],
[0, 0, 1]])
check for larger sample:
>>> pcuma = np.repeat(pcum,500, 0)
>>> rvs = np.random.uniform(size=(1000))
>>> a,b,c = pcuma.T
>>> mnrvs = np.column_stack((rvs<=a, (rvs>a) & (rvs<=b), rvs>b)).astype(int)
>>> mnrvs[:500].sum(0)
array([456, 17, 27])
>>> mnrvs[500:].sum(0)
array([ 24, 35, 441])
>>> p*500
array([[ 450., 25., 25.],
[ 25., 25., 450.]])
More information about the SciPy-User
mailing list