[Numpy-discussion] Histograms via indirect index arrays

Tim Hochberg tim.hochberg at cox.net
Fri Mar 17 13:50:02 CST 2006


Piotr Luszczek wrote:

>On Friday 17 March 2006 14:43, Tim Hochberg wrote:
>  
>
>>Piotr Luszczek wrote:
>>    
>>
>>>On Friday 17 March 2006 13:29, Travis Oliphant wrote:
>>>      
>>>
>>>>Piotr Luszczek wrote:
>>>>        
>>>>
>>>>>>Yes, this is intended (sort of --- this particular example isn't
>>>>>>the reason for the behavior though).
>>>>>>
>>>>>>The issue is that the code g[idx] +=1 is equivalent in Python to
>>>>>>
>>>>>>g[idx] = g[idx] + 1
>>>>>>            
>>>>>>
>>>>>This is not what I read at:
>>>>>
>>>>>http://docs.python.org/ref/numeric-types.html
>>>>>
>>>>>Quote:
>>>>>
>>>>>These methods should attempt to do the operation in-place
>>>>>(modifying self) and return the result (which could be, but does
>>>>>not have to be, self).
>>>>>
>>>>>What you describe is "lack of attempt" in which case the "fall
>>>>>back" behavior gets triggered.
>>>>>          
>>>>>
>>>>The problems is that this explanation is very clear when we are
>>>>talking about the syntax
>>>>
>>>>g += 1
>>>>
>>>>Then, there is a method of g that can be over-ridden to get the
>>>>desired behavior.  Now, what you are talking about is "indexing
>>>>followed by in-place addition".
>>>>
>>>>i.e.  at issue is
>>>>
>>>>how does python interpret
>>>>
>>>>g[idx] += 1
>>>>
>>>>How does this get compiled to byte-code?
>>>>
>>>>There are two possibilities:
>>>>
>>>>1) g[idx] creates a new object which then has 1 added to it using
>>>>in-place addition.
>>>>
>>>>    This would not produce the desired behavior as g[idx] is a
>>>>copy of the data when idx is a
>>>>     general indexing array as it is in this case.  So, you make a
>>>>copy of those indices, add 1 to them
>>>>     and then do what with the resut?
>>>>
>>>>2) g[idx] += 1  gets converted to g[idx] = g[idx] + 1
>>>>
>>>>   This appears to be effectively what Python actually does. 
>>>>Notice that there is no way for us to control this behavior
>>>>because there is no __inplace_with_indexing_add__ operator to
>>>>over-ride.
>>>>
>>>>There is no such single operation to over-ride for the object.   In
>>>>other words, I don't see anyay for us to even alter the object to
>>>>get the behavior you want from that syntax.  We can, of course,
>>>>add a function or method to do that, but I we would have to extend
>>>>Python to get the behavior you want here.
>>>>        
>>>>
>>>Hardly. At least from what I'm seeing happens on a small example.
>>>'g[idx] += 1' becomes ('g' and 'idx' are generic objects):
>>>__getitem__(self, idx)
>>>__iadd__(1)
>>>__setitem__(result of __iadd__)
>>>
>>>By design numpy returns views from __getitem__
>>>      
>>>
>>You are missing that idx is not a slice index. rather it is an index
>>array (or list). In this case g[idx] does *not* return a view, it
>>returns a copy. From this everything else follows.
>>    
>>
>
>You are missing that indexing with array doesn't have to make
>a copy.
>
No I'm not.

> From this everythin else follows.
>In other words, It's a design decision that slices give views and
>indirection with arrays (I don't care for lists) gives a copy.
>I question things on conceptual level: both objects (the array an
>the indexing array) are from numpy. What code would break
>if the operation didn't return a copy?
>
>  
>
>>Conceivably g[idx] could return a psuedo-array object like flat does,
>>but I suspect that might have bad performance characteristics.
>>    
>>
As I said conceivably it could be done without making a copy. However 
that would have performance implications; some good and some bad. It's 
almost certain that some code would break too, although I don't think 
that's much of a concern if someone got this in before numpy 1.0. We'll 
probably never know how well or ill this performs unless someone steps 
up and trys to implement this. I'm not interested. Are you?

-tim

>>-tim
>>
>>    
>>
>>>In this case, it would be view into 'self' and 'idx' so the __iadd__
>>>would just use the 'idx' directly rather than a copy.
>>>Finally, __setitem__ doesn't do anything since 'self' and 'value'
>>>will be the same.
>>>
>>>Of course, this is just a quick draft. I don't know how it would
>>>work in practice and in other cases.
>>>
>>>      
>>>
>>>>Note, however, if idx is slice syntax, then the operation is done
>>>>without making copies because, for example,
>>>>
>>>>g[1:10:2]
>>>>
>>>>returns a "view" of the data (an array that shares the same memory)
>>>>as the original array.  Thus,
>>>>
>>>>g[1:10:2] += 1
>>>>
>>>>does an inplace add without data-copying.
>>>>
>>>>It may be possible to do what you want using zero-strided arrays,
>>>>but I'll leave that for a more motivated contributor.
>>>>
>>>>-Travis
>>>>        
>>>>
>>>-------------------------------------------------------
>>>This SF.Net email is sponsored by xPML, a groundbreaking scripting
>>>language that extends applications into web and mobile media.
>>>Attend the live webcast and join the prime developer group breaking
>>>into this new coding territory!
>>>http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=1
>>>21642 _______________________________________________
>>>Numpy-discussion mailing list
>>>Numpy-discussion at lists.sourceforge.net
>>>https://lists.sourceforge.net/lists/listinfo/numpy-discussion
>>>      
>>>
>>-------------------------------------------------------
>>This SF.Net email is sponsored by xPML, a groundbreaking scripting
>>language that extends applications into web and mobile media. Attend
>>the live webcast and join the prime developer group breaking into
>>this new coding territory!
>>http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121
>>642 _______________________________________________
>>Numpy-discussion mailing list
>>Numpy-discussion at lists.sourceforge.net
>>https://lists.sourceforge.net/lists/listinfo/numpy-discussion
>>    
>>
>
>
>  
>






More information about the Numpy-discussion mailing list