[Numpy-discussion] Histograms via indirect index arrays

Piotr Luszczek luszczek at cs.utk.edu
Fri Mar 17 14:20:01 CST 2006


On Friday 17 March 2006 16:49, Tim Hochberg wrote:
> Piotr Luszczek wrote:
> >On Friday 17 March 2006 14:43, Tim Hochberg wrote:
> >>Piotr Luszczek wrote:
> >>>On Friday 17 March 2006 13:29, Travis Oliphant wrote:
> >>>>Piotr Luszczek wrote:
> >>>>>>Yes, this is intended (sort of --- this particular example
> >>>>>> isn't the reason for the behavior though).
> >>>>>>
> >>>>>>The issue is that the code g[idx] +=1 is equivalent in Python
> >>>>>> to
> >>>>>>
> >>>>>>g[idx] = g[idx] + 1
> >>>>>
> >>>>>This is not what I read at:
> >>>>>
> >>>>>http://docs.python.org/ref/numeric-types.html
> >>>>>
> >>>>>Quote:
> >>>>>
> >>>>>These methods should attempt to do the operation in-place
> >>>>>(modifying self) and return the result (which could be, but does
> >>>>>not have to be, self).
> >>>>>
> >>>>>What you describe is "lack of attempt" in which case the "fall
> >>>>>back" behavior gets triggered.
> >>>>
> >>>>The problems is that this explanation is very clear when we are
> >>>>talking about the syntax
> >>>>
> >>>>g += 1
> >>>>
> >>>>Then, there is a method of g that can be over-ridden to get the
> >>>>desired behavior.  Now, what you are talking about is "indexing
> >>>>followed by in-place addition".
> >>>>
> >>>>i.e.  at issue is
> >>>>
> >>>>how does python interpret
> >>>>
> >>>>g[idx] += 1
> >>>>
> >>>>How does this get compiled to byte-code?
> >>>>
> >>>>There are two possibilities:
> >>>>
> >>>>1) g[idx] creates a new object which then has 1 added to it using
> >>>>in-place addition.
> >>>>
> >>>>    This would not produce the desired behavior as g[idx] is a
> >>>>copy of the data when idx is a
> >>>>     general indexing array as it is in this case.  So, you make
> >>>> a copy of those indices, add 1 to them
> >>>>     and then do what with the resut?
> >>>>
> >>>>2) g[idx] += 1  gets converted to g[idx] = g[idx] + 1
> >>>>
> >>>>   This appears to be effectively what Python actually does.
> >>>>Notice that there is no way for us to control this behavior
> >>>>because there is no __inplace_with_indexing_add__ operator to
> >>>>over-ride.
> >>>>
> >>>>There is no such single operation to over-ride for the object.  
> >>>> In other words, I don't see anyay for us to even alter the
> >>>> object to get the behavior you want from that syntax.  We can,
> >>>> of course, add a function or method to do that, but I we would
> >>>> have to extend Python to get the behavior you want here.
> >>>
> >>>Hardly. At least from what I'm seeing happens on a small example.
> >>>'g[idx] += 1' becomes ('g' and 'idx' are generic objects):
> >>>__getitem__(self, idx)
> >>>__iadd__(1)
> >>>__setitem__(result of __iadd__)
> >>>
> >>>By design numpy returns views from __getitem__
> >>
> >>You are missing that idx is not a slice index. rather it is an
> >> index array (or list). In this case g[idx] does *not* return a
> >> view, it returns a copy. From this everything else follows.
> >
> >You are missing that indexing with array doesn't have to make
> >a copy.
>
> No I'm not.
>
> > From this everythin else follows.
> >In other words, It's a design decision that slices give views and
> >indirection with arrays (I don't care for lists) gives a copy.
> >I question things on conceptual level: both objects (the array an
> >the indexing array) are from numpy. What code would break
> >if the operation didn't return a copy?
> >
> >>Conceivably g[idx] could return a psuedo-array object like flat
> >> does, but I suspect that might have bad performance
> >> characteristics.
>
> As I said conceivably it could be done without making a copy. However
> that would have performance implications; some good and some bad.
> It's almost certain that some code would break too, although I don't
> think that's much of a concern if someone got this in before numpy
> 1.0. We'll probably never know how well or ill this performs unless
> someone steps up and trys to implement this. I'm not interested. Are
> you?

Yes. Very much so. To me it's a very compact and intuitive way
of encoding many things I do. Histogram is trivial but useful.
Sparse matrix calculations are more important to me.

I've raised this issue at least once before and didn't get much
response. I think it's better this time around. But ultimately I
agree with Robert Kern that I need to show some code and
timing data to make people pay attention.

> -tim
>
> >>-tim
> >>
> >>>In this case, it would be view into 'self' and 'idx' so the
> >>> __iadd__ would just use the 'idx' directly rather than a copy.
> >>>Finally, __setitem__ doesn't do anything since 'self' and 'value'
> >>>will be the same.
> >>>
> >>>Of course, this is just a quick draft. I don't know how it would
> >>>work in practice and in other cases.
> >>>
> >>>>Note, however, if idx is slice syntax, then the operation is done
> >>>>without making copies because, for example,
> >>>>
> >>>>g[1:10:2]
> >>>>
> >>>>returns a "view" of the data (an array that shares the same
> >>>> memory) as the original array.  Thus,
> >>>>
> >>>>g[1:10:2] += 1
> >>>>
> >>>>does an inplace add without data-copying.
> >>>>
> >>>>It may be possible to do what you want using zero-strided arrays,
> >>>>but I'll leave that for a more motivated contributor.
> >>>>
> >>>>-Travis
> >>>
> >>>-------------------------------------------------------
> >>>This SF.Net email is sponsored by xPML, a groundbreaking scripting
> >>>language that extends applications into web and mobile media.
> >>>Attend the live webcast and join the prime developer group
> >>> breaking into this new coding territory!
> >>>http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=
> >>>1 21642 _______________________________________________
> >>>Numpy-discussion mailing list
> >>>Numpy-discussion at lists.sourceforge.net
> >>>https://lists.sourceforge.net/lists/listinfo/numpy-discussion
> >>
> >>-------------------------------------------------------
> >>This SF.Net email is sponsored by xPML, a groundbreaking scripting
> >>language that extends applications into web and mobile media.
> >> Attend the live webcast and join the prime developer group
> >> breaking into this new coding territory!
> >>http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=1
> >>21 642 _______________________________________________
> >>Numpy-discussion mailing list
> >>Numpy-discussion at lists.sourceforge.net
> >>https://lists.sourceforge.net/lists/listinfo/numpy-discussion




More information about the Numpy-discussion mailing list