[SciPy-User] Accumulation sum using indirect indexes

Travis Oliphant travis@continuum...
Thu Feb 2 13:11:55 CST 2012


On Feb 2, 2012, at 1:01 PM, josef.pktd@gmail.com wrote:

> On Thu, Feb 2, 2012 at 1:16 PM, Warren Weckesser
> <warren.weckesser@enthought.com> wrote:
>> 
>> 
>> On Wed, Feb 1, 2012 at 10:34 AM, Alexander Kalinin <alec.kalinin@gmail.com>
>> wrote:
>>> 
>>> Yes, but loops are quite slow for large data sets. I have tried the pandas
>>> groupby.sum() and it is faster.
>>> 
>> 
>> 
>> Pandas is probably the correct tool to use for this, but it will be nice
>> when numpy has a native "group-by" capability.
>> 
>> For what it's worth (had to scratch the itch, so to speak), the attached
>> script provides a "pure numpy" implementation without a python loop.  The
>> output of the script is
>> 
>> In [53]: run pseudo_group_by.py
>> Label   Data
>>  20    [1 2 3]
>>  20    [1 2 4]
>>  10    [3 3 1]
>>   0    [5 0 0]
>>  20    [1 9 0]
>>  10    [2 3 4]
>>  20    [9 9 1]
>> 
>> Label  Num.   Sum
>>   0     1   [5 0 0]
>>  10     2   [5 6 5]
>>  20     4   [12 22  8]
>> 
>> 
>> A drawback of the method is that it will make a reordered copy of the data.
>> I haven't compared the performance to pandas.
> 
> Nice use of reduceat. I found it recently in an example but haven't used it yet.
> It looks convenient if labels are presorted and numeric.

Reduceat is pretty convenient, but it's limited right now because the reductions have to be delimited by contiguous fence-posts: each segment ends where the next one begins. There is a NEP, together with the group-by NEP, proposing a reduce that takes arbitrary index ranges for reductions.

-Travis
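
A minimal sketch of the sort-then-reduceat approach discussed above, using the labels and data from the script output quoted in this thread. This is not the attached pseudo_group_by.py script, which is not shown here; the function name group_sum and its return layout are hypothetical, and the input is assumed to be non-empty. Sorting first is what makes each group one contiguous run, so the run starts can serve as the fence-posts that reduceat requires.

```python
import numpy as np

def group_sum(labels, data):
    """Sum the rows of `data` that share a label (hypothetical helper)."""
    labels = np.asarray(labels)
    data = np.asarray(data)
    # Sort so that equal labels form contiguous runs; fancy indexing
    # makes a reordered copy of the data.
    order = np.argsort(labels, kind="stable")
    sorted_labels = labels[order]
    sorted_data = data[order]
    # Fence-posts: index where each run of equal labels starts.
    is_start = np.r_[True, sorted_labels[1:] != sorted_labels[:-1]]
    starts = np.flatnonzero(is_start)
    # reduceat sums rows starts[i]:starts[i+1]; the last run goes to the end.
    sums = np.add.reduceat(sorted_data, starts, axis=0)
    counts = np.diff(np.r_[starts, len(sorted_labels)])
    return sorted_labels[starts], counts, sums

labels = [20, 20, 10, 0, 20, 10, 20]
data = [[1, 2, 3], [1, 2, 4], [3, 3, 1], [5, 0, 0],
        [1, 9, 0], [2, 3, 4], [9, 9, 1]]
unique_labels, counts, sums = group_sum(labels, data)
for lab, cnt, s in zip(unique_labels, counts, sums):
    print(lab, cnt, s)
```

The reordered copy made by the fancy-indexing step is the drawback mentioned in the quoted message.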


> 
> Josef
> 
>> 
>> Warren
>> 
>> 
>>> 
>>> 
>>> 2012/2/1 Frédéric Bastien <nouiz@nouiz.org>
>>>> 
>>>> It will be slow, but you can make a python loop.
>>>> 
>>>> Fred
>>>> 
>>>> On Jan 31, 2012 3:34 PM, "Alexander Kalinin" <alec.kalinin@gmail.com>
>>>> wrote:
>>>>> 
>>>>> Hello!
>>>>> 
>>>>> I use SciPy in computer graphics applications. My task is to calculate
>>>>> vertex normals by averaging face normals. In other words, I want to
>>>>> accumulate vectors with the same ids. For example,
>>>>> 
>>>>> ids = numpy.array([0, 1, 1, 2])
>>>>> n = numpy.array([ [0.1, 0.1, 0.1], [0.1, 0.1, 0.1], [0.1, 0.1, 0.1],
>>>>> [0.1, 0.1, 0.1] ])
>>>>> 
>>>>> I need the result:
>>>>> nv = numpy.array([ [0.1, 0.1, 0.1], [0.2, 0.2, 0.2], [0.1, 0.1, 0.1] ])
>>>>> 
>>>>> The simplest code,
>>>>> nv[ids] += n
>>>>> does not work; I know about this. For 1D arrays I use the
>>>>> numpy.bincount(...) function, but it does not work for 2D arrays.
>>>>> 
>>>>> So, my question: what is the best way to calculate an accumulated sum for 2D
>>>>> arrays using indirect indexes?
>>>>> 
>>>>> Sincerely,
>>>>> Alexander
>>>>> 
>>>>> _______________________________________________
>>>>> SciPy-User mailing list
>>>>> SciPy-User@scipy.org
>>>>> http://mail.scipy.org/mailman/listinfo/scipy-user
>>>>> 
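
A short sketch of the problem in the original question, with two loop-free alternatives that are not proposed anywhere in this thread. It first shows why nv[ids] += n drops contributions when ids contains repeats, then accumulates with numpy.add.at (an unbuffered in-place operation available from NumPy 1.8, which postdates this thread) and with numpy.bincount applied one column at a time. The variable names other than ids, n and nv are made up for illustration.

```python
import numpy as np

ids = np.array([0, 1, 1, 2])
n = np.full((4, 3), 0.1)
num_groups = ids.max() + 1

# Buffered fancy-index update: id 1 appears twice, but only one of its
# two rows ends up added, so row 1 holds 0.1 instead of 0.2.
nv_buffered = np.zeros((num_groups, n.shape[1]))
nv_buffered[ids] += n

# Unbuffered accumulation: every occurrence of an id is added.
nv = np.zeros((num_groups, n.shape[1]))
np.add.at(nv, ids, n)

# Alternative: run the 1D bincount once per column and stack the results.
nv_bincount = np.column_stack([
    np.bincount(ids, weights=n[:, k], minlength=num_groups)
    for k in range(n.shape[1])
])

print(nv)
```

The bincount variant loops over columns only, so for three-component normals it costs three calls regardless of the number of rows.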