[Numpy-discussion] masked index surprise

Keith Goodman kwgoodman@gmail....
Fri Aug 14 14:45:44 CDT 2009


On Fri, Aug 14, 2009 at 12:24 PM, Robert Kern<robert.kern@gmail.com> wrote:
> On Fri, Aug 14, 2009 at 14:20, Keith Goodman<kwgoodman@gmail.com> wrote:
>> On Fri, Aug 14, 2009 at 11:52 AM, Robert Kern<robert.kern@gmail.com> wrote:
>>> On Fri, Aug 14, 2009 at 13:05, John Hunter<jdh2358@gmail.com> wrote:
>>>> I just tracked down a subtle bug in my code, which is equivalent to
>>>>
>>>>
>>>> In [64]: x, y = np.random.rand(2, n)
>>>>
>>>> In [65]: z = np.zeros_like(x)
>>>>
>>>> In [66]: mask = x>0.5
>>>>
>>>> In [67]: z[mask] = x/y
>>>>
>>>>
>>>>
>>>> I meant to write
>>>>
>>>>  z[mask] = x[mask]/y[mask]
>>>>
>>>> so I can fix my code, but why is line 67 allowed
>>>>
>>>>  In [68]: z[mask].shape
>>>>  Out[68]: (54,)
>>>>
>>>>  In [69]: (x/y).shape
>>>>  Out[69]: (100,)
>>>>
>>>> it seems like broadcasting would fail
>>>
>>> Broadcasting doesn't take place with boolean masks. Instead, the
>>> values repeat if there are too few and extra values are ignored.
>>> Boolean indexing derives from Numeric's putmask() implementation,
>>> which had these semantics, rather than other forms of indexing.
>>>
>>> You may consider this a wart or a bad design decision (and I would
>>> probably agree), but it is not a bug.
>>
>> Are the last two, x[[1]] and x[np.array([1])], broadcasting?
>>
>> >>> x = np.array([1,2,3])
>> >>> x[1] = np.array([4,5,6])
>> ValueError: setting an array element with a sequence.
>> >>> x[(1,)] = np.array([4,5,6])
>> ValueError: array dimensions are not compatible for copy
>> >>> x[[1]] = np.array([4,5,6])
>> >>> x
>>   array([1, 4, 3])
>> >>> x[np.array([1])] = np.array([4,5,6])
>> >>> x
>>   array([1, 4, 3])
>
> I guess I'm just makin' stuff up again. kern_is_right() == False. All
> forms repeat, not broadcast, since they derive from put() and
> putmask() which both have the repeating/ignoring semantics.
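
For reference, the repeating and ignoring can be seen directly in np.put()
and np.putmask(). A minimal sketch with made-up arrays (a, b and c are
illustrative names only, not anything from the code discussed above):

```python
import numpy as np

# put(): too few values, so the single value is reused for every target.
a = np.arange(5)
np.put(a, [0, 2, 4], [7])
# a is now [7, 1, 7, 3, 7]

# put(): too many values, so the surplus (5 and 6) is silently dropped.
b = np.array([1, 2, 3])
np.put(b, [1], [4, 5, 6])
# b is now [1, 4, 3]

# putmask(): where the mask is True, element n takes values[n % len(values)].
c = np.arange(5)
np.putmask(c, c > 1, [-33, -44])
# c is now [0, 1, -33, -44, -33]
```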

The ignoring scares me. If the dimensions aren't compatible I'd much
rather get a ValueError. Does anyone have a use case for ignoring?
(Besides ignoring my email.)
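
To make the alternative concrete, here is a rough sketch of the stricter
behavior I'd prefer. strict_masked_assign is a hypothetical helper, not a
NumPy function; it assumes a boolean mask with the same shape as the target
and accepts only a scalar or exactly one value per selected element.

```python
import numpy as np

def strict_masked_assign(target, mask, values):
    """Set target[mask] = values, raising instead of repeating or ignoring."""
    values = np.asarray(values)
    n_selected = int(np.count_nonzero(mask))
    # Accept a scalar (0-d) or a 1-d array with one value per selected slot.
    if values.ndim != 0 and values.shape != (n_selected,):
        raise ValueError(
            "mask selects %d elements but values has shape %s"
            % (n_selected, values.shape)
        )
    target[mask] = values
    return target

x = np.array([0.9, 0.1, 0.8, 0.3])
y = np.array([2.0, 4.0, 4.0, 1.0])
z = np.zeros_like(x)
mask = x > 0.5

# Intended form: two values for the two selected slots.
strict_masked_assign(z, mask, x[mask] / y[mask])

# Mistaken form: four values for two slots is rejected, not truncated.
rejected = False
try:
    strict_masked_assign(z, mask, x / y)
except ValueError:
    rejected = True
```

With a check like this, the z[mask] = x/y slip would fail loudly at the
assignment instead of quietly filling z with the wrong values.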

