[Numpy-discussion] is it a bug?

Travis E. Oliphant oliphant@enthought....
Thu Mar 12 22:43:13 CDT 2009


shuwj5460@163.com wrote:
>>
>> It's certainly weird, but it's working as designed. Fancy indexing via
>> arrays is a separate subsystem from indexing via slices. Basically,
>> fancy indexing decides the outermost shape of the result (e.g. the
>> leftmost items in the shape tuple). If there are any sliced axes, they
>> are *appended* to the end of that shape tuple.
>>
>>     
> import numpy as np
>
> x = np.arange(30)
> x.shape = (2,3,5)
>
> idx = np.array([0,1,3,4])
> e = x[:,:,idx]
> print(e.shape)
> #---> returns (2,3,4), just as I expected.
>
> e = x[0,:,idx]
> print(e.shape)
> #---> returns (4,3).
>
> e = x[:,0,idx]
> print(e.shape)
> #---> returns (2,4), not (4,2). Why do these three cases behave so
> # differently?
>   

This is probably best characterized as a wart stemming from a use-case 
oversight in the approach created to handle mixing simple indexing and 
advanced indexing.

Basically, you can understand what happens by noting that when 
scalars are used in combination with index arrays, they are treated as 
if they were part of an indexing array.  In other words 0 is interpreted 
as [0] (or 1 is interpreted as [1]) when combined with advanced 
indexing.  This is in part so that scalars will be broadcast to the 
shape of any indexing array to correctly handle indexing in other 
use-cases.
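
As a small illustration of the point above (a sketch, not taken from the 
original message), indexing with the scalar 0 next to an index array 
gives the same result as writing the one-element index array [0], or as 
spelling out the fully broadcast index arrays by hand. The variable 
names are made up for the example.

```python
import numpy as np

x = np.arange(30).reshape(2, 3, 5)
idx = np.array([0, 1, 3, 4])

# Scalars next to an index array are broadcast against it.
with_scalars = x[0, 0, idx]

# Same selection with the scalars written as one-element index arrays.
with_lists = x[[0], [0], idx]

# Same selection with every index already broadcast to shape (4,).
zeros = np.zeros(4, dtype=int)
explicit = x[zeros, zeros, idx]

print(with_scalars.shape)   # (4,)
print(with_scalars)         # [0 1 3 4]
print(np.array_equal(with_scalars, with_lists))   # True
print(np.array_equal(with_scalars, explicit))     # True
```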

Then, when advanced indexing is combined with ':' or '...' some special 
rules show up in determining the output shape that have to do with 
resolving potential ambiguities.   It is arguable that the rules for 
resolving ambiguities are a bit simplistic and therefore don't handle 
some real use-cases very well like the case you show.   On the other 
hand, simple rules are better even if the rules about combining ':' and 
'...' and advanced indexing are not well-known.

So, to be a little more clear about what is going on, define idx2 = [0] 
and then ask what should the shapes of x[idx2, :, idx] and x[:, idx2, 
idx] be?   Remember that advanced indexing will broadcast idx2 and idx 
to the same shape (in this case (4,), but they could broadcast to any 
shape at all).   This broadcasted result shape must be somehow combined 
with the shape resulting from performing the slice selection.
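
The broadcast shape mentioned above can be inspected directly with 
np.broadcast. This is only a sketch: idx2_col is a hypothetical 
two-dimensional index array, added to show that the broadcast shape need 
not be one-dimensional.

```python
import numpy as np

idx = np.array([0, 1, 3, 4])      # shape (4,)
idx2 = np.array([0])              # shape (1,)

# (1,) broadcast with (4,) gives (4,)
shape_1d = np.broadcast(idx2, idx).shape
print(shape_1d)                   # (4,)

# Hypothetical 2-D index array: (2, 1) broadcast with (4,) gives (2, 4)
idx2_col = np.array([[0], [1]])
shape_2d = np.broadcast(idx2_col, idx).shape
print(shape_2d)                   # (2, 4)
```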

With x[:, idx2, idx] it is unambiguous to tack the broadcasted shape to 
the end of the shape resulting from the slice-selection (i.e. 
x[:,0,0].shape).   This leads to the (2,4) result.
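
A quick check of this case, as a sketch: the result shape is the slice 
shape followed by the broadcast shape of the index arrays. The 
two-dimensional idx2_col is again a hypothetical example, not part of 
the original question.

```python
import numpy as np

x = np.arange(30).reshape(2, 3, 5)
idx = np.array([0, 1, 3, 4])
idx2 = [0]

slice_shape = x[:, 0, 0].shape                            # (2,)
index_shape = np.broadcast(np.asarray(idx2), idx).shape   # (4,)

e = x[:, idx2, idx]
print(e.shape)                                # (2, 4)
print(e.shape == slice_shape + index_shape)   # True

# The index arrays are adjacent, so the same rule holds when they
# broadcast to a 2-D shape: (2,) + (2, 4)
idx2_col = np.array([[0], [1]])
e2 = x[:, idx2_col, idx]
print(e2.shape)                               # (2, 2, 4)
```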

Now, what about x[idx2, :, idx]?   The idx2 and idx are still broadcast 
to the same shape which could be any shape (in this particular case it 
is (4,)), but the slice-selection is done "in the middle".   So, where 
should the shape of the slice selection (i.e. x[0,:,0].shape) be placed 
in the output shape?    At the time this is determined, there is no 
notion that idx2 "came from a scalar" and so it could have come from any 
array.   Therefore, when there is this kind of ambiguity, the code 
always places the broadcasted shape at the beginning.    Thus, the 
result is (4,) + (3,)  --> (4,3).
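
The same kind of check for the "slice in the middle" case, as a sketch. 
The last few lines show one possible way to get the (3,4) layout by 
indexing in two steps; that workaround is an addition for illustration 
and is not something proposed in the original message.

```python
import numpy as np

x = np.arange(30).reshape(2, 3, 5)
idx = np.array([0, 1, 3, 4])
idx2 = [0]

index_shape = np.broadcast(np.asarray(idx2), idx).shape   # (4,)
slice_shape = x[0, :, 0].shape                            # (3,)

# The index arrays are separated by a slice, so the broadcast
# shape is placed first and the slice shape is appended.
e = x[idx2, :, idx]
print(e.shape)                                # (4, 3)
print(e.shape == index_shape + slice_shape)   # True

# The scalar form from the question behaves the same way.
print(x[0, :, idx].shape)                     # (4, 3)

# Possible workaround: apply the scalar first, then the index array.
f = x[0][:, idx]
print(f.shape)                                # (3, 4)
print(np.array_equal(f, e.T))                 # True
```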

Perhaps it is a bit surprising in this particular case, but it is 
working as designed.    I admit that this particular asymmetry does 
create some cognitive dissonance which leaves something to be desired.  

-Travis



