# [Numpy-discussion] use index array of len n to select columns of n x m array

Martin Spacek numpy@mspacek.mm...
Fri Aug 6 15:11:55 CDT 2010

```On 2010-08-06 06:57, Keith Goodman wrote:
> You can speed it up by getting rid of two copies:
>
> idx = np.arange(a.shape[0])
> idx *= a.shape[1]
> idx += i

Thanks, operating in-place helps. Here's my new version:

def rowtake(a, i):
    """For each row in a, return values according to column indices in the
    corresponding row in i. Returned shape == i.shape"""
    assert a.ndim == 2
    assert i.ndim <= 2
    if i.ndim == 1:
        j = np.arange(a.shape[0])
    else: # i.ndim == 2
        j = np.repeat(np.arange(a.shape[0]), i.shape[1])
        j.shape = i.shape
    j *= a.shape[1]
    j += i
    return a.flat[j]

>>> a = np.arange(20)
>>> a.shape = 5, 4
>>> a
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15],
       [16, 17, 18, 19]])
>>> i = np.array([[2, 1],
                  [3, 1],
                  [1, 1],
                  [0, 0],
                  [3, 1]])
>>> timeit rowtake(a, i)
100000 loops, best of 3: 14.7 us per loop
>>> timeit rowtake_cy(a, i)
100000 loops, best of 3: 10.6 us per loop

So now it's getting close to the element-by-element Cython version
(14.7 us vs. 10.6 us).

On 2010-08-06 03:29, josef.pktd@gmail.com wrote:
> I still find broadcasting easier to read, even if it might be a bit slower
>
> >>> a[np.arange(5)[:,None], i]
> array([[ 2,  1],
>        [ 7,  5],
>        [ 9,  9],
>        [12, 12],
>        [19, 17]])

Josef, I'd forgotten you could use None to increase the dimensionality of an
array. Neat. Somehow, it's also almost twice as fast as the Cython version:

>>> timeit a[np.arange(a.shape[0])[:, None], i]
100000 loops, best of 3: 5.76 us per loop

I like it. Thanks for all the help!

Martin

```
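
For anyone wanting to check that the two approaches in the message above agree, here is a minimal, self-contained sketch. The names `rowtake_flat` and `rowtake_broadcast` are made up for this illustration, and the flat-index variant uses `ravel()` and `reshape()` where the original uses `a.flat` and a `shape` assignment. The `np.take_along_axis` call is not part of the thread; it assumes a newer NumPy (1.15 or later) and is included only for comparison.

```python
import numpy as np


def rowtake_flat(a, i):
    """Flat-index approach: turn (row, column) pairs into offsets into the
    flattened array. Hypothetical name; mirrors the rowtake() idea above."""
    assert a.ndim == 2
    assert i.ndim <= 2
    if i.ndim == 1:
        j = np.arange(a.shape[0])
    else:  # i.ndim == 2
        # one row number per entry of i, laid out with i's shape
        j = np.repeat(np.arange(a.shape[0]), i.shape[1]).reshape(i.shape)
    j *= a.shape[1]  # in-place: offset of the start of each row
    j += i           # in-place: add the column index within the row
    return a.ravel()[j]


def rowtake_broadcast(a, i):
    """Broadcasting approach: pair a column of row numbers with i."""
    rows = np.arange(a.shape[0])
    if i.ndim == 2:
        rows = rows[:, None]  # shape (n, 1) broadcasts against i of shape (n, k)
    return a[rows, i]


a = np.arange(20).reshape(5, 4)
i = np.array([[2, 1],
              [3, 1],
              [1, 1],
              [0, 0],
              [3, 1]])

flat_result = rowtake_flat(a, i)
broadcast_result = rowtake_broadcast(a, i)

# Assumption: NumPy >= 1.15, where np.take_along_axis is available.
along_axis_result = np.take_along_axis(a, i, axis=1)

print(flat_result)
```

All three results should match the array shown in Josef's quoted output. Which one is fastest will depend on the NumPy version and array sizes, so the timings in the message are best re-measured rather than assumed.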