[SciPy-User] Accumulation sum using indirect indexes

josef.pktd@gmai... josef.pktd@gmai...
Thu Feb 2 13:01:10 CST 2012


On Thu, Feb 2, 2012 at 1:16 PM, Warren Weckesser
<warren.weckesser@enthought.com> wrote:
>
>
> On Wed, Feb 1, 2012 at 10:34 AM, Alexander Kalinin <alec.kalinin@gmail.com>
> wrote:
>>
>> Yes, but for large data sets loops are quite slow. I have tried Pandas
>> groupby.sum() and it works faster.
>>
>
>
> Pandas is probably the correct tool to use for this, but it will be nice
> when numpy has a native "group-by" capability.
>
> For what it's worth (had to scratch the itch, so to speak), the attached
> script provides a "pure numpy" implementation without a python loop.  The
> output of the script is
>
> In [53]: run pseudo_group_by.py
> Label   Data
>  20    [1 2 3]
>  20    [1 2 4]
>  10    [3 3 1]
>   0    [5 0 0]
>  20    [1 9 0]
>  10    [2 3 4]
>  20    [9 9 1]
>
> Label  Num.   Sum
>   0     1   [5 0 0]
>  10     2   [5 6 5]
>  20     4   [12 22  8]
>
>
> A drawback of the method is that it will make a reordered copy of the data.
> I haven't compared the performance to pandas.

Nice use of reduceat. I found it recently in an example but haven't used it yet.
It looks convenient if labels are presorted and numeric.
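
For reference, the sort-then-reduceat idea can be sketched roughly as
below. This is not Warren's attached script, only a minimal sketch of the
same technique. The helper name group_sum is made up for illustration, and
it assumes there is at least one row of data.

```python
import numpy as np

def group_sum(labels, data):
    """Sum the rows of `data` that share a label (hypothetical helper)."""
    labels = np.asarray(labels)
    data = np.asarray(data)
    # Stable sort so that rows with equal labels become contiguous.
    order = np.argsort(labels, kind="stable")
    sorted_labels = labels[order]
    sorted_data = data[order]  # fancy indexing: this is a reordered copy
    # Index where each run of equal labels starts.
    is_start = np.r_[True, sorted_labels[1:] != sorted_labels[:-1]]
    starts = np.flatnonzero(is_start)
    # Sum each run of rows in one call, without a Python loop.
    sums = np.add.reduceat(sorted_data, starts, axis=0)
    counts = np.diff(np.r_[starts, len(sorted_labels)])
    return sorted_labels[starts], counts, sums

# Same labels and data as in the output quoted above.
labels = np.array([20, 20, 10, 0, 20, 10, 20])
data = np.array([[1, 2, 3], [1, 2, 4], [3, 3, 1], [5, 0, 0],
                 [1, 9, 0], [2, 3, 4], [9, 9, 1]])
uniq, counts, sums = group_sum(labels, data)
for label, num, total in zip(uniq, counts, sums):
    print(label, num, total)
```

The argsort step is what removes the "presorted" requirement, at the cost
of the reordered copy mentioned above.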

Josef

>
> Warren
>
>
>>
>>
>> 2012/2/1 Frédéric Bastien <nouiz@nouiz.org>
>>>
>>> It will be slow, but you can make a python loop.
>>>
>>> Fred
>>>
>>> On Jan 31, 2012 3:34 PM, "Alexander Kalinin" <alec.kalinin@gmail.com>
>>> wrote:
>>>>
>>>> Hello!
>>>>
>>>> I use SciPy in computer graphics applications. My task is to calculate
>>>> vertex normals by averaging face normals. In other words, I want to
>>>> accumulate vectors with the same ids. For example,
>>>>
>>>> ids = numpy.array([0, 1, 1, 2])
>>>> n = numpy.array([ [0.1, 0.1, 0.1], [0.1, 0.1, 0.1], [0.1, 0.1, 0.1],
>>>> [0.1, 0.1, 0.1] ])
>>>>
>>>> I need the result:
>>>> nv = ([ [0.1, 0.1, 0.1], [0.2, 0.2, 0.2], [0.1, 0.1, 0.1]])
>>>>
>>>> The simplest code:
>>>> nv[ids] += n
>>>> does not work; I know about this. For 1D arrays I use the
>>>> numpy.bincount(...) function, but it does not work for 2D arrays.
>>>>
>>>> So, my question: what is the best way to calculate an accumulation sum
>>>> for 2D arrays using indirect indexes?
>>>>
>>>> Sincerely,
>>>> Alexander
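
Since bincount already works for the 1D case, one possible workaround is
to call it once per column with the weights argument. This is only a
sketch, assuming the number of columns is small (3 for normals) and that
ids holds non-negative integers. The names num_vertices and nv_columns are
illustrative.

```python
import numpy as np

ids = np.array([0, 1, 1, 2])
n = np.array([[0.1, 0.1, 0.1], [0.1, 0.1, 0.1],
              [0.1, 0.1, 0.1], [0.1, 0.1, 0.1]])

# Number of output rows: one per vertex id (assumes ids start at 0).
num_vertices = int(ids.max()) + 1

# One bincount per column; the only Python loop is over the 3 columns,
# not over the rows of n.
nv_columns = [np.bincount(ids, weights=n[:, k], minlength=num_vertices)
              for k in range(n.shape[1])]
nv = np.column_stack(nv_columns)
print(nv)
```

Rows of n that share an id are added together, so the row for id 1
accumulates two face normals.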
>>>>
>>>> _______________________________________________
>>>> SciPy-User mailing list
>>>> SciPy-User@scipy.org
>>>> http://mail.scipy.org/mailman/listinfo/scipy-user
>>>>

