[Numpy-discussion] Numpy Advanced Indexing Question

Robert Kern robert.kern@gmail....
Thu Jul 17 03:39:46 CDT 2008


On Thu, Jul 17, 2008 at 03:16, Stéfan van der Walt <stefan@sun.ac.za> wrote:
> Hi Robert
>
> 2008/7/17 Robert Kern <robert.kern@gmail.com>:
>> In [42]: smallcube = cube[idx_i,idx_j,idx_k]
>
> Fantastic -- a good way to warm up the brain-circuit in the morning!
> Is there an easy-to-remember rule that predicts the output shape of
> the operation above?  I'm trying to imagine how the output would
> change if I altered the dimensions of idx_i or idx_j, but it's hard.

Like I said, they all get broadcasted against each other. The final
output has the shape of the broadcasted index arrays, and its values
are found by iterating in parallel over those broadcasted index arrays.
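
Here's a tiny runnable sketch of that rule (the shapes are made up and
much smaller than in your example):

import numpy as np

cube = np.arange(2 * 3 * 4).reshape(2, 3, 4)

idx_i = np.array([0, 1]).reshape(2, 1, 1)   # shape (2, 1, 1)
idx_j = np.array([0, 2]).reshape(1, 2, 1)   # shape (1, 2, 1)
idx_k = np.array([1, 3]).reshape(1, 1, 2)   # shape (1, 1, 2)

out = cube[idx_i, idx_j, idx_k]
print(out.shape)                      # (2, 2, 2), the broadcast shape
# Each element comes from iterating the broadcasted index arrays in
# parallel, e.g. at (1, 0, 1) the indices are idx_i->1, idx_j->0, idx_k->3:
print(out[1, 0, 1] == cube[1, 0, 3])  # True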

> It looks like you can do all sorts of interesting things by
> manipulating the indices.  For example, if I take
>
> In [137]: x = np.arange(12).reshape((3,4))
>
> I can produce either
>
> In [138]: x[np.array([[0,1]]), np.array([[1, 2]])]
> Out[138]: array([[1, 6]])
>
> or
>
> In [140]: x[np.array([[0],[1]]), np.array([[1], [2]])]
> Out[140]:
> array([[1],
>        [6]])
>
> and even
>
> In [141]: x[np.array([[0],[1]]), np.array([[1, 2]])]
> Out[141]:
> array([[1, 2],
>        [5, 6]])
>
> or its transpose
>
> In [143]: x[np.array([[0,1]]), np.array([[1], [2]])]
> Out[143]:
> array([[1, 5],
>        [2, 6]])
>
> Is it possible to separate the indexing in order to understand it
> better?  My thinking was
>
> cube_i = cube[idx_i,:,:].squeeze()
> cube_j = cube_i[:,idx_j,:].squeeze()
> cube_k = cube_j[:,:,idx_k].squeeze()
>
> Not sure what would happen if the original array had single dimensions, though.

You'd have a problem.

So the way fancy indexing interacts with slices is a bit tricky, and
this is why we couldn't use the nicer syntax of cube[:,:,idx_k]. All
axes with fancy indices are collected together. Their index arrays are
broadcasted and iterated over. *For each iterate*, all of the slices
are collected, and those sliced axes are *added* to the output array.
If you had used fancy indexing on all of the axes, then the iterate
would be a scalar value pulled from the original array. If you mix
fancy indexing and slices, the iterate is the *array* formed by the
remaining slices.

So if idx_k is shaped (ni,nj,3), for example, bigcube = cube[:,:,idx_k]
will have the shape (ni,nj,ni,nj,3), and
bigcube[:,:,i,j,k] == cube[:,:,idx_k[i,j,k]].
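
In code, with small made-up sizes for ni, nj, nk:

import numpy as np

ni, nj, nk = 2, 3, 5
cube = np.arange(ni * nj * nk).reshape(ni, nj, nk)
idx_k = np.random.randint(0, nk, size=(ni, nj, 3))

bigcube = cube[:, :, idx_k]
print(bigcube.shape)        # (2, 3, 2, 3, 3) == (ni, nj) + idx_k.shape

# For each entry of idx_k, the sliced axes (:, :) are kept whole:
i, j, k = 1, 2, 0
print(np.all(bigcube[:, :, i, j, k] == cube[:, :, idx_k[i, j, k]]))  # True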

Is that clear, or am I obfuscating the subject more?

> Back to the original problem:
>
> In [127]: idx_i.shape
> Out[127]: (10, 1, 1)
>
> In [128]: idx_j.shape
> Out[128]: (1, 15, 1)
>
> In [129]: idx_k.shape
> Out[129]: (10, 15, 7)
>
> For the constant slice case, I guess idx_k could also have been (1, 1, 7)?
>
> The construction of the cube could probably be done using only
>
> cube.flat = np.arange(nk)

Yes, but only due to a weird feature of assigning to .flat. If the RHS
is too short, it gets repeated. Since the last axis is contiguous,
repeating arange(nk) happily coincides with the desired result of
cube[i,j] == arange(nk) for all i,j. This won't check the size,
though. If I give it cube.flat=np.arange(nk+1), it will repeat that
array just fine, although it doesn't line up.
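
For instance (again with small made-up sizes):

import numpy as np

ni, nj, nk = 2, 3, 4
cube = np.empty((ni, nj, nk), dtype=int)

cube.flat = np.arange(nk)             # RHS repeats to fill all ni*nj*nk slots
print(np.all(cube == np.arange(nk)))  # True: each cube[i,j] == arange(nk)

cube.flat = np.arange(nk + 1)         # no size check; it still repeats
print(cube[0, 0])                     # [0 1 2 3]
print(cube[0, 1])                     # [4 0 1 2] -- no longer lines up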

cube[:,:,:] = np.arange(nk), on the other hand, broadcasts the RHS to
the shape of cube, then does the assignment. If the RHS cannot be
broadcasted to the right shape (in this case because it is not the
same length as the final axis of the LHS), an error is raised. I find
the reuse of the broadcasting concept more memorable and more robust
than the (mostly) ad hoc use of plain repetition with .flat.
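
Concretely, with the same made-up sizes as before:

import numpy as np

ni, nj, nk = 2, 3, 4
cube = np.empty((ni, nj, nk), dtype=int)

cube[:, :, :] = np.arange(nk)         # (nk,) broadcasts against (ni, nj, nk)
print(np.all(cube == np.arange(nk)))  # True

try:
    cube[:, :, :] = np.arange(nk + 1) # (nk+1,) cannot broadcast to (ni, nj, nk)
except ValueError as err:
    print(err)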

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
 -- Umberto Eco

