[Numpy-discussion] Broadcasting and indexing
Fri Jan 22 08:51:13 CST 2010
On Thu, Jan 21, 2010 at 1:03 PM, Emmanuelle Gouillart
> Hi Thomas,
> Broadcasting rules apply only to ufuncs (and, by extension, to some numpy
> functions that use ufuncs). Indexing obeys different rules and always starts
> with the first dimension.
Just a clarification: If there are several index arrays, then standard
broadcasting rules apply for them. It's a bit messier when arrays and
slice objects are mixed.
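A small sketch of both cases, using made-up arrays (the names a, a3, rows and cols are only for illustration):

```python
import numpy as np

a = np.arange(20).reshape(4, 5)
rows = np.array([0, 2])        # shape (2,)
cols = np.array([1, 3, 4])     # shape (3,)

# Several index arrays are broadcast against each other:
# (2, 1) against (3,) gives a (2, 3) result.
block = a[rows[:, np.newaxis], cols]

# Index arrays of equal shape are paired elementwise:
# this picks a[0, 1], a[2, 3] and a[3, 4].
pairs = a[np.array([0, 2, 3]), cols]

# Mixing arrays and slices: when the array-like indices are separated
# by a slice, the broadcast dimension moves to the front of the result.
a3 = np.zeros((3, 4, 5))
separated = a3[0, :, [0, 1]].shape     # (2, 4), not (4, 2)
adjacent = a3[:, [0, 1], [0, 1]].shape  # (3, 2)
```

The last two lines show the messier part: the position of the broadcast dimension in the result depends on whether the index arrays sit next to each other.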
An informative explanation appeared in the March 2009 thread "Is this
a bug?", and many more examples can be found in the mailing list archives.
> However, you don't have to use broadcasting for such indexing operations:
> >>> a[:, c] = 0
> zeroes columns indexed by c.
> If you want to index along the 3rd dimension, you can use a[:, :, c],
> etc. If the dimension along which you index is a variable, you can also
> use the function np.rollaxis, which lets you change the order of the
> dimensions of an array. You can then index along the first dimension
> (a[c]) and change the order of the dimensions back. Here is an example:
> >>> a = np.ones((3,4,5,6))
> >>> c = np.array([1,0,1,0,1], dtype=bool)
> >>> tmp_a = np.rollaxis(a, 2, 0)
> >>> tmp_a.shape
> (5, 3, 4, 6)
> >>> tmp_a[c] = 0
> >>> a = np.rollaxis(tmp_a, 0, 3)
> >>> a.shape
> (3, 4, 5, 6)
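An alternative that avoids reordering the axes is to build the index tuple by hand. This is only a sketch: zero_along_axis is a hypothetical helper name, not a numpy function.

```python
import numpy as np

def zero_along_axis(arr, mask, axis):
    """Set the entries selected by a 1D boolean mask along `axis` to zero, in place."""
    # Start with a full slice for every dimension, i.e. arr[:, :, ..., :].
    index = [slice(None)] * arr.ndim
    # Replace the slice for the chosen dimension by the mask.
    index[axis] = mask
    arr[tuple(index)] = 0
    return arr

a = np.ones((3, 4, 5, 6))
c = np.array([1, 0, 1, 0, 1], dtype=bool)
zero_along_axis(a, c, axis=2)   # same effect as a[:, :, c, :] = 0
```

Since the axis is an ordinary argument here, the same call works for any dimension whose length matches the mask.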
> Hope this helps.
> On Thu, Jan 21, 2010 at 11:37:09AM -0500, Thomas Robitaille wrote:
>> I'm trying to understand how array broadcasting can be used for indexing. In the following, I use the term 'row' to refer to the first dimension of a 2D array, and 'column' to the second, just because that's how numpy prints them out.
>> If I consider the following example:
>> >>> a = np.random.random((4,5))
>> >>> b = np.random.random((5,))
>> >>> a + b
>> array([[ 1.45499556, 0.60633959, 0.48236157, 1.55357393, 1.4339261 ],
>>        [ 1.28614593, 1.11265001, 0.63308615, 1.28904227, 1.34070499],
>>        [ 1.26988279, 0.84683018, 0.98959466, 0.76388223, 0.79273084],
>>        [ 1.27859505, 0.9721984 , 1.02725009, 1.38852061, 1.56065028]])
>> I understand how this works, because it works as expected as described in
>> So b gets broadcast to shape (1,5), then because the first dimension is 1, the operation is applied to all rows.
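The shape alignment described here can be checked directly. A minimal sketch with deterministic values in place of the random ones (b_full and same are illustrative names):

```python
import numpy as np

a = np.arange(20.0).reshape(4, 5)
b = np.arange(5.0)

# Shapes are aligned from the trailing axis: (5,) is treated as (1, 5),
# and the length-1 axis is then stretched to match (4, 5).
b_full = np.broadcast_to(b, a.shape)

# Adding b is the same as adding the explicitly reshaped (1, 5) array.
same = np.array_equal(a + b, a + b[np.newaxis, :])
```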
>> Now I am trying to apply this to array indexing. So for example, I want to set specific columns, indicated by a boolean array, to zero, but the following fails:
>> >>> c = np.array([1,0,1,0,1], dtype=bool)
>> >>> a[c] = 0
>> Traceback (most recent call last):
>> File "<stdin>", line 1, in <module>
>> IndexError: index (4) out of range (0<=index<3) in dimension 0
>> However, if I try reducing the size of c to 4, then it works, and sets rows, not columns, equal to zero
>> >>> c = np.array([1,0,1,0], dtype=bool)
>> >>> a[c] = 0
>> >>> a
>> array([[ 0. , 0. , 0. , 0. , 0. ],
>>        [ 0.41526315, 0.7425491 , 0.39872546, 0.56141914, 0.69795153],
>>        [ 0. , 0. , 0. , 0. , 0. ],
>>        [ 0.40771227, 0.60209749, 0.7928894 , 0.66089748, 0.91789682]])
>> But I would have thought that the indexing array would have been broadcast in the same way as for a sum, i.e. c would be broadcast to have dimensions (1,5) and then would have been able to set certain columns in all rows to zero.
>> Why is it that for indexing, the broadcasting seems to happen in a different way than when performing operations like additions or multiplications? For background, I'm trying to write a routine that performs a set of operations on an n-d array, where n is not known in advance, together with a 1D array. Broadcasting rules let me do most of these operations without knowing the dimensionality of the n-d array, but now that I need to perform indexing, the convention seems to change, and this is a real issue.
>> Thanks in advance for any advice,
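For the n-d case where the 1D array lines up with the last dimension, as it does under broadcasting, an Ellipsis in the index stands in for however many leading dimensions there are. A minimal sketch (zero_last_axis is a hypothetical name chosen for illustration):

```python
import numpy as np

def zero_last_axis(arr, mask):
    """Zero the entries selected by a 1D boolean mask along the last axis, in place."""
    # The Ellipsis expands to as many full slices as arr needs.
    arr[..., mask] = 0
    return arr

c = np.array([1, 0, 1, 0, 1], dtype=bool)

a2 = zero_last_axis(np.ones((4, 5)), c)         # same as a2[:, c] = 0
a4 = zero_last_axis(np.ones((3, 4, 2, 5)), c)   # same as a4[:, :, :, c] = 0
```

This matches the broadcasting convention of aligning on the trailing dimension, so the routine does not need to know n.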
>> NumPy-Discussion mailing list