[Numpy-discussion] indexed arrays ignoring duplicates

Robert Kern robert.kern@gmail....
Wed Sep 29 22:17:18 CDT 2010


On Wed, Sep 29, 2010 at 20:11,  <josef.pktd@gmail.com> wrote:
> On Wed, Sep 29, 2010 at 9:03 PM, Damien Morton <dmorton@bitfurnace.com> wrote:
>> On Thu, Sep 30, 2010 at 3:28 AM, Robert Kern <robert.kern@gmail.com> wrote:
>>> On Wed, Sep 29, 2010 at 12:00, Pauli Virtanen <pav@iki.fi> wrote:
>>>> Wed, 29 Sep 2010 11:15:08 -0500, Robert Kern wrote:
>>>> [clip: inplace addition with duplicates]
>>>>> Use numpy.bincount() instead.
>>>>
>>>> It might be worthwhile to add a separate helper function for this
>>>> purpose. Bincount makes a copy that could be avoided, and it is difficult
>>>> to find if you don't know about this trick.
>>>
>>> I'm fairly certain that most of the arrays used are fairly small, as
>>> such things are reckoned. I'm not sure that in-place modification
>>> would win us much. And I'm not sure what other name for the function
>>> would make it easier to find. AFAICT, using bincount() this way is not
>>> really a "trick"; it's just the right way to do exactly this job. If
>>> anything, "x.fill(0);x[i] += 1" is the "trick".
>>
>> bincount only works for gathering/accumulating scalars. Even the
>> 'weights' parameter is limited to scalars.
>
> Do you mean that bincount only works with 1d arrays? I also think that
> this is a major limitation of it.

Feel free to change it. I think that extending the weights array to
allow greater dimensions is an obvious extension of the current
semantics.
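
For reference, a minimal sketch of the behaviour discussed in this
thread, assuming a recent NumPy (the minlength argument and np.add.at
were added after this message was written). The array names and the
per-column loop are illustrative only; the loop is just one possible
workaround for the 1-d weights limitation, not an existing NumPy helper.

```python
import numpy as np

i = np.array([0, 1, 1, 3, 1])  # indices containing duplicates

# Fancy-indexed in-place addition is buffered, so each duplicated
# index is incremented only once.
x = np.zeros(4, dtype=int)
x[i] += 1

# bincount() counts every occurrence of each index.
counts = np.bincount(i, minlength=4)

# weights must be 1-d, so 2-d weights are accumulated column by column.
w = np.arange(10.0).reshape(5, 2)
acc = np.column_stack(
    [np.bincount(i, weights=w[:, k], minlength=4) for k in range(w.shape[1])]
)

# Unbuffered in-place alternative that accepts the 2-d weights directly.
acc_inplace = np.zeros((4, 2))
np.add.at(acc_inplace, i, w)
```

Here x ends up as [1, 1, 0, 1] while counts is [1, 3, 0, 1], and acc
and acc_inplace hold the same per-index column sums.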

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

