[SciPy-user] Creating a Matrix from a Sum

David Warde-Farley dwf@cs.toronto....
Fri Jul 3 14:01:44 CDT 2009


On 3-Jul-09, at 2:45 PM, David Warde-Farley wrote:

> On 2-Jul-09, at 9:25 AM, Lorenzo Isella wrote:
>
>> Dear All,
>> I need some help to efficiently write an array manipulation. I  
>> suspect
>> this could be a one-liner.
>> Assume that you have a histogram of observations that you store in a
>> vector x.
>> Let us say that its i-th entry, x_i, corresponds to the number of
>> observations in the i-th channel, for i=1,2...N. (or 0,1...N-1,  
>> please
>> let me know if there is any potential 0/1 pitfall in the following).
>> Now, for any two channels i and j, I want to calculate the  
>> probability
>> of having an observation  in any other channel k, where  k>=max(i,j).
>> That is to say
>>
>> P(i,j)=sum_{k=max(i,j)}^N x_k /C,
>> where C=sum_{i=1}^N x_i is just a normalization factor.
>
> Hm, there might be a simpler/more efficient way but this is what I
> came up with as far as 1-liners:
>
> P = np.max(np.concatenate(np.broadcast_arrays(x[np.newaxis, :,
> np.newaxis], x[:, np.newaxis, np.newaxis]),axis=2),axis=2); P /=  
> x.sum()

Oops, nevermind, that's wrong.

Here's how I would do it.

You want the sum from k to the end (N) so I'd extract a vector like  
this (the cumulative sum of the reversed array, reversed again

	sums = np.cumsum(x[::-1])[::-1]

Then you want the max of i and j at each position in the matrix. For  
this, though there may be a more efficient way, you can use mgrid to  
get i and j indices at each position in the matrix and then use  
concatenate and max on them to concatenate along a third axis and then  
do a max along that same axis to recover a 2D array:

	idx = np.concatenate([M[:,:,np.newaxis] for M in mgrid[0:len(x),  
0:len(x)]],axis=2).max(axis=2)

Finally, use fancy indexing and divide by your normalizer:

	P = sums[idx]  / x.sum()

Or, as a oneliner, much less readable:

	P = cumsum(x[::-1])[::-1][np.concatenate([M[:,:,np.newaxis] for M in  
mgrid[0:len(x), 0:len(x)]],axis=2).max(axis=2)] / x.sum()

HTH,

David


More information about the SciPy-user mailing list