[Numpy-discussion] Re: Histograms via indirect index arrays

Robert Kern robert.kern at gmail.com
Fri Mar 17 13:06:01 CST 2006


Piotr Luszczek wrote:
> On Friday 17 March 2006 14:58, Robert Kern wrote:
> 
>>Piotr Luszczek wrote:

>>>By design numpy returns views from __getitem__
>>
>>Only for slices.
>>
>>In [132]: a = arange(10)
>>
>>In [133]: idx = [2,2,3]
>>
>>In [134]: a[idx]
>>Out[134]: array([2, 2, 3])
>>
>>In [135]: b = a[idx]
>>
>>In [136]: b[-1] = 100
>>
>>In [137]: a
>>Out[137]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
> 
> Your example uses lists as indices. This is not interesting.
> I'm talking solely about arrays indexing other arrays.
> To me it is a special and very important case.

The result is exactly the same.

In [164]: a = arange(10)

In [165]: idx = array([2,2,3])

In [166]: b = a[idx]

In [167]: b[-1] = 100

In [168]: a
Out[168]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
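
For contrast, here is a minimal sketch of the slice case next to the index-array case. It assumes a recent NumPy imported as np, and it uses np.shares_memory, which is not part of the session above, to check whether the result aliases the original buffer.

```python
import numpy as np

# Basic slicing returns a view: writing through it changes `a`.
a = np.arange(10)
s = a[2:4]
s[-1] = 100
slice_shares = np.shares_memory(a, s)
a_after_slice_write = a.copy()

# Indexing with an integer array returns a copy: `a` is left untouched.
a = np.arange(10)
idx = np.array([2, 2, 3])
b = a[idx]
b[-1] = 100
index_shares = np.shares_memory(a, b)

print(slice_shares, a_after_slice_write)
print(index_shares, a)
```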

>>>In this case, it would be view into 'self' and 'idx' so the
>>>__iadd__ would just use the 'idx' directly rather than a copy.
>>>Finally, __setitem__ doesn't do anything since 'self' and 'value'
>>>will be the same.
>>
>>No, value is the result of __iadd__ on the temporary array.
>>
>>'g[idx] += 1' expands to:
>>
>>  tmp = g.__getitem__(idx)
>>  val = tmp.__iadd__(1)
>>  g.__setitem__(idx, val)
> 
> You're missing the point. 'tmp' can be of a very specific type
> so that 'g.__setitem__' doesn't have to do anything: the 'add 1'
> was done by '__iadd__'.

No, I got your point just fine; I was correcting a detail.

You would have to reimplement __getitem__ to return a new kind of object that
represents a non-uniformly-strided array. If you want to get anywhere, go
implement that object and come back. When we have something concrete to look at
instead of vague assertions, then we can start tackling the issues of
integrating it into the core such that 'g[idx] += 1' works like you want it to.
For example, index arrays are used in more places than in-place addition. Your
new type needs to be usable in all of those places since __getitem__, __iadd__
and __setitem__ don't know that they are being called in that order and in that
fashion.
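
To make the consequence for histograms concrete, here is a rough sketch of the three calls applied to a real array with a repeated index. It assumes a recent NumPy imported as np. The last part uses np.add.at and np.bincount, which are not proposed anywhere in this thread and are shown only as one way to get per-occurrence accumulation without a new array type.

```python
import numpy as np

idx = np.array([0, 2, 2, 1])

# The augmented assignment spelled out as the three calls.
g = np.zeros(4, dtype=int)
tmp = g.__getitem__(idx)    # a copy holding g[0], g[2], g[2], g[1]
val = tmp.__iadd__(1)       # adds 1 to the copy, not to g
g.__setitem__(idx, val)     # g[2] is assigned twice, each time with 1
expanded = g.copy()

# The same statement written with the operator.
h = np.zeros(4, dtype=int)
h[idx] += 1

# Unbuffered accumulation: every occurrence of an index counts.
counts = np.zeros(4, dtype=int)
np.add.at(counts, idx, 1)

# For plain integer histograms, bincount gives the same counts.
binned = np.bincount(idx, minlength=4)

print(expanded, h, counts, binned)
```

The repeated index 2 is incremented only once by the augmented assignment, because the addition happens on the temporary copy and the result is then written back element by element.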

>>Given these class definitions:
>>
>>  class A(object):
>>      def __getitem__(self, idx):
>>          print 'A.__getitem__(%r)' % idx
>>          return B()
>>      def __setitem__(self, idx, value):
>>          print 'A.__setitem__(%r, %r)' % (idx, value)
>>
>>
>>  class B(object):
>>      def __iadd__(self, x):
>>          print 'B.__iadd__(%r)' % x
>>          return self
>>      def __repr__(self):
>>          return 'B()'
>>
>>In [153]: a = A()
>>
>>In [154]: a[[0, 2, 2, 1]] += 1
>>A.__getitem__([0, 2, 2, 1])
>>B.__iadd__(1)
>>A.__setitem__([0, 2, 2, 1], B())
>>
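
The quoted classes use Python 2 print statements. A rough Python 3 rendering of the same trace follows; the calls list is a hypothetical stand-in for the print statements so that the order of calls can be inspected afterwards.

```python
calls = []  # hypothetical log replacing the original print statements


class B:
    def __iadd__(self, x):
        calls.append('B.__iadd__(%r)' % (x,))
        return self

    def __repr__(self):
        return 'B()'


class A:
    def __getitem__(self, idx):
        calls.append('A.__getitem__(%r)' % (idx,))
        return B()

    def __setitem__(self, idx, value):
        calls.append('A.__setitem__(%r, %r)' % (idx, value))


a = A()
a[[0, 2, 2, 1]] += 1

for line in calls:
    print(line)
```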
>>>Of course, this is just a quick draft. I don't know how it would
>>>work in practice and in other cases.
>>
>>Aye, there's the rub.
> 
> Show me a code that breaks.

<shrug> Show us some code that works. I'm not interested in implementing your
feature request. You are. There's plenty of work that you can do that doesn't
depend on anyone else agreeing with you, so you can stop arguing and start coding.

-- 
Robert Kern
robert.kern at gmail.com

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco




