[Numpy-discussion] indexed arrays ignoring duplicates

josef.pktd@gmai... josef.pktd@gmai...
Wed Sep 29 23:07:28 CDT 2010


On Wed, Sep 29, 2010 at 11:24 PM, Damien Morton <dmorton@bitfurnace.com> wrote:
> On Thu, Sep 30, 2010 at 11:11 AM,  <josef.pktd@gmail.com> wrote:
>>> bincount only works for gathering/accumulating scalars. Even the
>>> 'weights' parameter is limited to scalars.
>>
>> Do you mean that bincount only works with 1d arrays? I also think that
>> this is a major limitation of it.
>
> >>> from numpy import *
> >>> a = array((1,2,2,3,3))
> >>> w = array(((1,2),(3,4),(5,6),(7,8),(9,10)))
> >>> bincount(a,weights=w)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> ValueError: object too deep for desired array
> >>> w0 = array((1,2,3,4,5))
> >>> bincount(a,weights=w0)
> array([ 0.,  1.,  5.,  9.])

Since I'm not enough of a C person to change bincount itself, how about
offsetting the indices so that each column of w gets its own set of bins:

>>> import numpy as np
>>> a = np.array((1,2,2,3,3))
>>> w = np.array(((1,2),(3,4),(5,6),(7,8),(9,10)))
>>> a2 = a[:,None] - 1 + np.array([0, a.max()])
>>> a
array([1, 2, 2, 3, 3])
>>> w
array([[ 1,  2],
       [ 3,  4],
       [ 5,  6],
       [ 7,  8],
       [ 9, 10]])
>>> np.bincount(a2.ravel(),weights=w.ravel()).reshape(2,-1).T
array([[  1.,   2.],
       [  8.,  10.],
       [ 16.,  18.]])

I had never thought of doing this before, even though I have been
using bincount for some time.
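
The same trick can be wrapped in a small helper. This is only a sketch:
the name bincount_2d is made up for illustration and is not part of
NumPy. It assumes the labels are non-negative integers and the weights
are a 2d array with one row per label, and it assumes a NumPy version
whose bincount accepts minlength. Unlike the session above it does not
subtract 1 from the labels, so row 0 of the result is the (empty) bin
for label 0, as with plain bincount.

```python
import numpy as np

def bincount_2d(labels, weights):
    """Sum the rows of a 2-D `weights` array within each group in `labels`.

    Hypothetical helper, not part of NumPy. Assumes `labels` holds
    non-negative integers and `weights` has shape (len(labels), ncols).
    """
    labels = np.asarray(labels)
    weights = np.asarray(weights, dtype=float)
    nbins = labels.max() + 1
    ncols = weights.shape[1]
    # Give each column its own block of nbins bins by offsetting the labels.
    shifted = labels[:, None] + nbins * np.arange(ncols)
    flat = np.bincount(shifted.ravel(), weights=weights.ravel(),
                       minlength=nbins * ncols)
    # Row k of flat.reshape(ncols, nbins) holds the sums for column k.
    return flat.reshape(ncols, nbins).T

a = np.array((1, 2, 2, 3, 3))
w = np.array(((1, 2), (3, 4), (5, 6), (7, 8), (9, 10)))
result = bincount_2d(a, w)
print(result)
```

Dropping the first row of the result gives the same 3x2 array as the
session above.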


>
>>> I propose the name 'gather()' for the helper function that does this.
>>
>> I don't think "gather" is an obvious name to search for.
>
> "gather" is the name that the GPGPU community uses to describe this
> kind of operation. Not just for summation but for any kind of indexed
> reducing operation.

Some of the group functions that Travis is planning might go in this
direction.
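
To make "any kind of indexed reducing operation" concrete, here is a
rough sketch of a grouped maximum instead of a grouped sum. It relies on
the ufunc .at method (here np.maximum.at), which is an assumption on my
part in that it needs a NumPy release that provides it; it is not what
bincount does and not necessarily what any planned group functions will
look like. The variable name group_max is just for illustration.

```python
import numpy as np

a = np.array((1, 2, 2, 3, 3))
w = np.array(((1, 2), (3, 4), (5, 6), (7, 8), (9, 10)), dtype=float)

# One output row per label value 0..a.max(); start at -inf so any weight wins.
group_max = np.full((a.max() + 1, w.shape[1]), -np.inf)

# Unbuffered in-place maximum: every occurrence of a repeated label in `a`
# is applied, rather than only one of the duplicates.
np.maximum.at(group_max, a, w)
print(group_max)
```

Rows for labels that never occur (label 0 here) stay at -inf.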

Josef
