[Numpy-discussion] bug with assignment into an indexed array?
Mark Wiebe
mwwiebe@gmail....
Sat Aug 13 19:17:32 CDT 2011
On Thu, Aug 11, 2011 at 1:37 PM, Benjamin Root <ben.root@ou.edu> wrote:
> On Thu, Aug 11, 2011 at 10:33 AM, Olivier Delalleau <shish@keba.be> wrote:
>
>> 2011/8/11 Benjamin Root <ben.root@ou.edu>
>>
>>>
>>>
>>> On Thu, Aug 11, 2011 at 8:37 AM, Olivier Delalleau <shish@keba.be> wrote:
>>>
>>>> Maybe confusing, but working as expected.
>>>>
>>>>
>>>> When you write:
>>>> matched_to[np.array([0, 1, 2])] = 3
>>>> it calls __setitem__ on matched_to, with arguments (np.array([0, 1, 2]),
>>>> 3). So numpy understands you want to write 3 at these indices.
>>>>
>>>>
>>>> When you write:
>>>> matched_to[:3][match] = 3
>>>> it first calls __getitem__ with the slice as argument, which returns a
>>>> view of your array, then it calls __setitem__ on this view, and it fills
>>>> your matched_to array at the same time.
>>>>
>>>>
>>>> But when you write:
>>>> matched_to[np.array([0, 1, 2])][match] = 3
>>>> it first calls __getitem__ with the array as argument, which returns a
>>>> *copy* of your array, so that calling __setitem__ on this copy has no effect
>>>> on your original array.
>>>>
>>>> -=- Olivier
>>>>
>>>>
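The three cases Olivier describes can be reproduced with a short sketch. The array contents and the boolean `match` mask are assumptions here, since the quoted message does not show how either was defined.

```python
import numpy as np

# Assumed setup: a small integer array and a boolean mask of length 3.
match = np.array([True, False, True])

# Case 1: __setitem__ with an index array writes straight into the array.
a = np.zeros(5, dtype=int)
a[np.array([0, 1, 2])] = 3

# Case 2: the slice returns a view, so __setitem__ on it reaches the original.
b = np.zeros(5, dtype=int)
b[:3][match] = 3

# Case 3: the index array returns a copy, so the assignment is lost.
c = np.zeros(5, dtype=int)
c[np.array([0, 1, 2])][match] = 3

print(a)  # [3 3 3 0 0]
print(b)  # [3 0 3 0 0]
print(c)  # [0 0 0 0 0]
```

Only the third form silently discards the write, because the temporary copy is thrown away as soon as the statement finishes.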
>>> Right, but I guess my question is does it *have* to be that way? I guess
>>> it makes some sense with respect to indexing with a numpy array like I did
>>> with the last example, because an element could be referred to multiple
>>> times (which explains the common surprise with '+='), but with boolean
>>> indexing, we are guaranteed that each element of the view will appear at
>>> most once. Therefore, shouldn't boolean indexing always return a view, not
>>> a copy? Is the general case of arbitrary array selection inherently
>>> impossible to encode in a view versus a slice with a regular spacing?
>>>
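The '+=' surprise with repeated indices, and the fact that a boolean mask still yields a copy, can be illustrated as follows. This is a sketch with made-up data; `np.add.at` is used only for contrast and comes from NumPy releases later than the one this thread discusses.

```python
import numpy as np

idx = np.array([0, 0, 1])

# Buffered: index 0 is listed twice but is only incremented once.
a = np.zeros(3, dtype=int)
a[idx] += 1

# Unbuffered: np.add.at accumulates every occurrence of an index.
b = np.zeros(3, dtype=int)
np.add.at(b, idx, 1)

# A boolean mask selects each element at most once, yet still returns a copy.
c = np.arange(4)
mask = np.array([True, False, True, False])
selected = c[mask]

print(a, b, np.shares_memory(c, selected))  # [1 1 0] [2 1 0] False
```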
>>
>> Yes, due to the fact the array interface only supports regular spacing
>> (otherwise it is more difficult to get efficient access to arbitrary array
>> positions).
>>
>> -=- Olivier
>>
>>
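The "regular spacing" constraint Olivier mentions shows up in an array's strides. The sketch below uses an assumed int64 array so the byte counts are predictable; it is an illustration, not a description of NumPy internals beyond what `strides` and `base` expose.

```python
import numpy as np

a = np.arange(10, dtype=np.int64)

# A slice is fully described by an offset, a length, and one stride per axis.
v = a[1:8:3]                   # elements 1, 4, 7
print(v.strides)               # (24,): three 8-byte items between elements
print(v.base is a)             # True: v reuses a's memory

# Indices 0, 1, 5 have gaps of 1 and 4, so no single stride fits them.
f = a[np.array([0, 1, 5])]
print(np.shares_memory(a, f))  # False: the values were copied out
```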
> This still bothers me, though. I imagine that it is next to impossible to
> detect this situation from numpy's perspective, so it can't even emit a
> warning or error. Furthermore, for someone who makes a general function to
> modify the contents of some externally provided array, there is a
> possibility that the provided array is actually a copy not a view.
> Although, I guess it is the responsibility of the user to know the
> difference.
>
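Ben's concern about a general function receiving a copy instead of a view might look like this in practice. `zero_out` is a hypothetical helper invented for the example, and `np.shares_memory` is shown as one possible check a caller could make, assuming a NumPy version that provides it.

```python
import numpy as np

def zero_out(arr):
    """Hypothetical helper that modifies its argument in place."""
    arr[...] = 0

data = np.arange(6)
mask = data % 2 == 0

zero_out(data[mask])      # operates on a temporary copy
unchanged = data.copy()   # data is still 0..5 at this point

zero_out(data[:3])        # operates on a view, so data changes
print(unchanged, data)    # [0 1 2 3 4 5] [0 0 0 3 4 5]

# One way for a caller to check whether a selection aliases data.
print(np.shares_memory(data, data[mask]))  # False
print(np.shares_memory(data, data[:3]))    # True
```

The function itself cannot tell the two calls apart; only the caller knows how the argument was produced.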
> I guess that is the key problem. The key advantage we are taught about
> numpy arrays is the use of views for efficient access. It would seem that
> most access operations would use it, but in reality, only sliced access does.
> Everything else is a copy (unless you are doing fancy indexing with
> assignment). Maybe with some of the forthcoming changes that have been done
> with respect to nditer and ufuncs (in particular, I am thinking of the
> "where" kwarg), we could consider an enhancement allowing fancy
> indexing (or at least boolean indexing) to produce a view? Even if it is
> less efficient than a view from slicing, it would bring better consistency
> in behavior between the different forms of indexing.
>
> Just my 2 cents,
> Ben Root
>
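For reference, the ufunc "where" kwarg Ben mentions allows a masked in-place update without going through boolean indexing at all. This is a minimal sketch with made-up data; it does not produce a view, and whether it would cover the use case above is an assumption.

```python
import numpy as np

a = np.arange(6)
mask = a % 2 == 0

# Add 10 in place, but only where mask is True; other elements are untouched.
np.add(a, 10, out=a, where=mask)

print(a.tolist())  # [10, 1, 12, 3, 14, 5]
```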
I think it would be nice to evolve the NumPy indexing and array
representation towards the goal of indexing returning a view in all cases
with no exceptions. This would provide a much nicer mental model to program
with. Accomplishing such a transition will take a fair bit of time, though.
-Mark