[Numpy-discussion] Logical indexing and higher-dimensional arrays.
Tue Feb 7 23:01:30 CST 2012
On Feb 7, 2012, at 12:24 PM, Sturla Molden wrote:
> On 07.02.2012 19:17, Benjamin Root wrote:
>> >>> print x.shape
>> (2, 3, 4)
>> >>> print x[0, :, :].shape
>> (3, 4)
>> >>> print x[0, :, idx].shape
>> (2, 3)
> That looks like a bug to me. The length of the first dimension should be
> the same.
What you are probably expecting is (3,2) for this selection, but whenever a ':' slice sits between two "fancy-indexing" objects, the rules that govern fancy indexing are ambiguous in general about how to handle the case. In this specific case (a scalar being broadcast against idx) it is pretty clear what to do, and I consider it a bug that there is no special case for this situation.
Recall that the shape of the output with fancy indexing is determined by broadcasting together the indexing objects and using that as the shape of the output:
x[ind1, ind2] will produce an output with the shape of "broadcast(ind1, ind2)" whose elements are selected by the broadcasted tuple. When this is combined with standard slicing like so: x[ind1, :, ind2], the question is what the shape of the output should be. If ind1 is a scalar there is no ambiguity (and this should be special-cased, but unfortunately isn't). If ind1 is not a scalar, then what should the shape be under the rules of "zip-based" indexing? I don't know. So, in fact, what happens is that the broadcasted shape is determined and used as the "first part" of the shape. The "second part" of the shape is the shape of the slice-based selection.
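A minimal sketch of the two rules just described. The arrays and index values (y, ind1, ind2, a, b) are made up for illustration and do not come from the original thread:

```python
import numpy as np

# Pure fancy indexing: the output shape is the broadcast shape of the indices.
y = np.arange(12).reshape(3, 4)
ind1 = np.array([[0], [2]])   # shape (2, 1)
ind2 = np.array([1, 3, 0])    # shape (3,)

bshape = np.broadcast(ind1, ind2).shape
pure = y[ind1, ind2]          # pure[i, j] == y[ind1[i, 0], ind2[j]]
print(bshape, pure.shape)     # both are (2, 3)

# Fancy indices separated by a slice: the broadcast shape comes first,
# followed by the shape of the sliced dimension.
x = np.arange(24).reshape(2, 3, 4)
a = np.array([1, 0])          # shape (2,)
b = np.array([0, 3])          # shape (2,)

mixed = x[a, :, b]            # mixed[k, j] == x[a[k], j, b[k]]
print(mixed.shape)            # (2, 3): broadcast part (2,), then slice part (3,)
```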
So, in this case the (0 and idx) broadcast to the (2,) part of the shape, which is placed at the front of the result. The last part of the shape is the middle dimension (3,), resulting in the final shape (2,3).
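For anyone who wants to reproduce this, here is a small sketch. The thread does not show how idx was defined, so I am assuming an integer index list with two entries; the workaround at the end is one possible way to get the (3,2) result, not the only one:

```python
import numpy as np

x = np.arange(24).reshape(2, 3, 4)
idx = [0, 2]                     # assumption: two selected columns

result = x[0, :, idx]
print(result.shape)              # (2, 3): broadcast part first, slice part last

# Indexing in two steps avoids mixing the scalar and idx in one expression,
# so the sliced dimension keeps its position.
expected = x[0][:, idx]
print(expected.shape)            # (3, 2)

# The two results hold the same elements, just transposed.
print(np.array_equal(result.T, expected))
```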
It could be argued that, in fact, this is a good example of why fancy indexing should follow cross-product semantics, and the current zip-based semantics should be moved to a method --- where the difficult-to-understand behavior with intermediate slices is also harder to spell because you have to explicitly create slice objects with "slice". What do others think? Obviously this couldn't change immediately, but it could be on the road-map for NumPy 2.0 or later.
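To make the contrast concrete, here is a rough sketch of the two semantics using what NumPy offers today. Cross-product selection can already be spelled with np.ix_; the index values (rows, cols) are made up for illustration:

```python
import numpy as np

x = np.arange(24).reshape(2, 3, 4)
rows = [0, 1]
cols = [0, 2]

# Current zip-based semantics: rows and cols are paired element by element,
# giving the pairs (0, 0) and (1, 2), with the sliced axis moved to the end.
zipped = x[rows, :, cols]
print(zipped.shape)              # (2, 3)

# Cross-product semantics: every row is combined with every column,
# and each axis stays in its original position.
crossed = x[np.ix_(rows, np.arange(3), cols)]
print(crossed.shape)             # (2, 3, 2)

# The zip-based result is the "diagonal" of the cross-product result.
print(np.array_equal(zipped[1], crossed[1, :, 1]))
```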