[SciPy-User] Accumulation sum using indirect indexes

josef.pktd@gmai... josef.pktd@gmai...
Thu Feb 2 13:29:59 CST 2012


On Thu, Feb 2, 2012 at 2:11 PM, Travis Oliphant <travis@continuum.io> wrote:
>
> On Feb 2, 2012, at 1:01 PM, josef.pktd@gmail.com wrote:
>
>> On Thu, Feb 2, 2012 at 1:16 PM, Warren Weckesser
>> <warren.weckesser@enthought.com> wrote:
>>>
>>>
>>> On Wed, Feb 1, 2012 at 10:34 AM, Alexander Kalinin <alec.kalinin@gmail.com>
>>> wrote:
>>>>
>>>> Yes, but for large data sets loops are quite slow. I have tried Pandas
>>>> groupby.sum() and it is faster.
>>>>
>>>
>>>
>>> Pandas is probably the correct tool to use for this, but it will be nice
>>> when numpy has a native "group-by" capability.
>>>
>>> For what it's worth (had to scratch the itch, so to speak), the attached
>>> script provides a "pure numpy" implementation without a python loop.  The
>>> output of the script is
>>>
>>> In [53]: run pseudo_group_by.py
>>> Label   Data
>>>  20    [1 2 3]
>>>  20    [1 2 4]
>>>  10    [3 3 1]
>>>   0    [5 0 0]
>>>  20    [1 9 0]
>>>  10    [2 3 4]
>>>  20    [9 9 1]
>>>
>>> Label  Num.   Sum
>>>   0     1   [5 0 0]
>>>  10     2   [5 6 5]
>>>  20     4   [12 22  8]
>>>
>>>
>>> A drawback of the method is that it will make a reordered copy of the data.
>>> I haven't compared the performance to pandas.
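The attached script is not included here. A minimal sketch of a sort-plus-reduceat group sum that yields the same table might look like the following. It is an assumption about the approach, not Warren's actual code, and group_sum is a hypothetical helper name. Sorting by label makes each group contiguous, which is what reduceat needs, and it is also where the reordered copy comes from.

```python
import numpy as np

# Same labels and data as in the printed output above.
labels = np.array([20, 20, 10, 0, 20, 10, 20])
data = np.array([[1, 2, 3],
                 [1, 2, 4],
                 [3, 3, 1],
                 [5, 0, 0],
                 [1, 9, 0],
                 [2, 3, 4],
                 [9, 9, 1]])


def group_sum(labels, data):
    """Sum the rows of `data` that share a label (hypothetical helper)."""
    # A stable sort brings rows with equal labels next to each other.
    order = np.argsort(labels, kind="stable")
    sorted_labels = labels[order]
    sorted_data = data[order]  # this is the reordered copy of the data

    # On a sorted array, the first occurrence of each label is the
    # start of its group; these start indices are strictly increasing.
    unique_labels, starts, counts = np.unique(
        sorted_labels, return_index=True, return_counts=True)

    # reduceat sums rows from starts[i] up to starts[i + 1] (or the end).
    sums = np.add.reduceat(sorted_data, starts, axis=0)
    return unique_labels, counts, sums


unique_labels, counts, sums = group_sum(labels, data)
for label, count, total in zip(unique_labels, counts, sums):
    print(label, count, total)
```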
>>
>> nice use of reduceat, I found it recently in an example but haven't used it yet.
>> It looks convenient if labels are presorted and numeric.
>
> Reduceat is pretty convenient, but it's limited right now because you have to have contiguous fence-posts for your reductions. There is a NEP, alongside the group-by NEP, for a reduce that takes arbitrary index ranges for its reductions.

I have been looking forward to group-by for a long time, but I
would also be happy with a bincount that takes a 2d or nd weights
matrix.

Josef
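Until bincount accepts 2d weights, one workaround is to call it once per column. The sketch below assumes integer, non-negative ids and a 2d weights array with at least one column; bincount_2d is a hypothetical name, not a NumPy function.

```python
import numpy as np


def bincount_2d(ids, weights, minlength=0):
    """Column-by-column bincount for a 2d weights array (hypothetical helper)."""
    weights = np.asarray(weights, dtype=float)
    # np.bincount only accepts 1d weights, so handle one column at a time.
    columns = [np.bincount(ids, weights=weights[:, j], minlength=minlength)
               for j in range(weights.shape[1])]
    # Stack the per-column sums back into an (n_ids, n_columns) array.
    return np.column_stack(columns)


ids = np.array([0, 1, 1, 2])
n = np.full((4, 3), 0.1)
nv = bincount_2d(ids, n)
print(nv)
```

The Python loop here runs over columns, not rows, so for 3-component vectors it costs only three bincount calls regardless of how many rows there are.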

>
> -Travis
>
>
>>
>> Josef
>>
>>>
>>> Warren
>>>
>>>
>>>>
>>>>
>>>> 2012/2/1 Frédéric Bastien <nouiz@nouiz.org>
>>>>>
>>>>> It will be slow, but you can make a python loop.
>>>>>
>>>>> Fred
>>>>>
>>>>> On Jan 31, 2012 3:34 PM, "Alexander Kalinin" <alec.kalinin@gmail.com>
>>>>> wrote:
>>>>>>
>>>>>> Hello!
>>>>>>
>>>>>> I use SciPy in computer graphics applications. My task is to calculate
>>>>>> vertex normals by averaging face normals. In other words, I want to
>>>>>> accumulate vectors with the same ids. For example,
>>>>>>
>>>>>> ids = numpy.array([0, 1, 1, 2])
>>>>>> n = numpy.array([ [0.1, 0.1, 0.1], [0.1, 0.1, 0.1], [0.1, 0.1, 0.1],
>>>>>> [0.1, 0.1, 0.1] ])
>>>>>>
>>>>>> I need the result:
>>>>>> nv = ([ [0.1, 0.1, 0.1], [0.2, 0.2, 0.2], [0.1, 0.1, 0.1]])
>>>>>>
>>>>>> The simplest code,
>>>>>> nv[ids] += n
>>>>>> does not work; I know about this. For 1D arrays I use the
>>>>>> numpy.bincount(...) function, but it does not work for 2D arrays.
>>>>>>
>>>>>> So, my question: what is the best way to calculate an accumulated sum
>>>>>> for 2D arrays using indirect indexes?
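A minimal sketch of why the fancy-indexed += falls short, and of one unbuffered alternative. It assumes a NumPy version that provides np.add.at (1.8 or later, which was released after this thread), so it was not an option for the posters at the time.

```python
import numpy as np

ids = np.array([0, 1, 1, 2])
n = np.full((4, 3), 0.1)

# Fancy-indexed += is buffered: for the repeated index 1 only one of
# the two contributions survives, so row 1 ends up as 0.1, not 0.2.
nv_buffered = np.zeros((3, 3))
nv_buffered[ids] += n

# np.add.at applies the addition once per index entry, repeats included.
nv = np.zeros((ids.max() + 1, n.shape[1]))
np.add.at(nv, ids, n)

print(nv_buffered[1])
print(nv[1])
```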
>>>>>>
>>>>>> Sincerely,
>>>>>> Alexander
>>>>>>
>>>>>> _______________________________________________
>>>>>> SciPy-User mailing list
>>>>>> SciPy-User@scipy.org
>>>>>> http://mail.scipy.org/mailman/listinfo/scipy-user
>>>>>>

