[Numpy-discussion] vectorizing
josef.pktd@gmai...
josef.pktd@gmai...
Fri Jun 5 15:06:40 CDT 2009
On Fri, Jun 5, 2009 at 3:53 PM, <josef.pktd@gmail.com> wrote:
> On Fri, Jun 5, 2009 at 2:07 PM, Brian Blais <bblais@bryant.edu> wrote:
>> Hello,
>> I have a vectorizing problem that I don't see an obvious way to solve. What
>> I have is a vector like:
>> obs=array([1,2,3,4,3,2,1,2,1,2,1,5,4,3,2])
>> and a matrix
>> T=zeros((6,6))
>> and what I want in T is a count of all of the transitions in obs, e.g.
>> T[1,2]=3 because the sequence 1-2 happens 3 times, T[3,4]=1 because the
>> sequence 3-4 only happens once, etc... I can do it unvectorized like:
>> for o1,o2 in zip(obs[:-1],obs[1:]):
>> T[o1,o2]+=1
>>
>> which gives the correct answer from above, which is:
>> array([[ 0., 0., 0., 0., 0., 0.],
>> [ 0., 0., 3., 0., 0., 1.],
>> [ 0., 3., 0., 1., 0., 0.],
>> [ 0., 0., 2., 0., 1., 0.],
>> [ 0., 0., 0., 2., 0., 0.],
>> [ 0., 0., 0., 0., 1., 0.]])
>>
>>
>> but I thought there would be a better way. I tried:
>> o1=obs[:-1]
>> o2=obs[1:]
>> T[o1,o2]+=1
>> but this doesn't give a count, it just yields 1's at the transition points,
>> like:
>> array([[ 0., 0., 0., 0., 0., 0.],
>> [ 0., 0., 1., 0., 0., 1.],
>> [ 0., 1., 0., 1., 0., 0.],
>> [ 0., 0., 1., 0., 1., 0.],
>> [ 0., 0., 0., 1., 0., 0.],
>> [ 0., 0., 0., 0., 1., 0.]])
>>
>> Is there a clever way to do this? I could write a quick Cython solution,
>> but I wanted to keep this as an all-numpy implementation if I can.
>>
>
> histogram2d or its imitation, there was a discussion on histogram2d a
> short time ago
>
>>>> obs=np.array([1,2,3,4,3,2,1,2,1,2,1,5,4,3,2])
>>>> obs2 = obs - 1
>>>> trans = np.hstack((0,np.bincount(obs2[:-1]*6+6+obs2[1:]),0)).reshape(6,6)
I don't think this is correct, there are problems for the first and
last position if (1,1) and (5,5) are possible. I don't remember how I
did it before. The histogram2d version still looks correct for this
case. histogram2d uses bincount under the hood in a similar way.
Josef
>>>> re = np.array([[ 0., 0., 0., 0., 0., 0.],
> ... [ 0., 0., 3., 0., 0., 1.],
> ... [ 0., 3., 0., 1., 0., 0.],
> ... [ 0., 0., 2., 0., 1., 0.],
> ... [ 0., 0., 0., 2., 0., 0.],
> ... [ 0., 0., 0., 0., 1., 0.]])
>>>> np.all(re == trans)
> True
>
>>>> trans
> array([[0, 0, 0, 0, 0, 0],
> [0, 0, 3, 0, 0, 1],
> [0, 3, 0, 1, 0, 0],
> [0, 0, 2, 0, 1, 0],
> [0, 0, 0, 2, 0, 0],
> [0, 0, 0, 0, 1, 0]])
>
>
> or
>
>>>> h, e1, e2 = np.histogram2d(obs[:-1], obs[1:], bins=6, range=[[0,5],[0,5]])
>>>> re
> array([[ 0., 0., 0., 0., 0., 0.],
> [ 0., 0., 3., 0., 0., 1.],
> [ 0., 3., 0., 1., 0., 0.],
> [ 0., 0., 2., 0., 1., 0.],
> [ 0., 0., 0., 2., 0., 0.],
> [ 0., 0., 0., 0., 1., 0.]])
>>>> h
> array([[ 0., 0., 0., 0., 0., 0.],
> [ 0., 0., 3., 0., 0., 1.],
> [ 0., 3., 0., 1., 0., 0.],
> [ 0., 0., 2., 0., 1., 0.],
> [ 0., 0., 0., 2., 0., 0.],
> [ 0., 0., 0., 0., 1., 0.]])
>
>>>> np.all(re == h)
> True
>
More information about the Numpy-discussion
mailing list