[SciPy-User] Accumulation sum using indirect indexes
> Yes, but for large data sets loops are quite slow. I have tried Pandas
> groupby.sum() and it works faster.
Pandas is probably the correct tool to use for this, but it will be nice
when numpy has a native "group-by" capability.
For what it's worth (had to scratch the itch, so to speak), the attached
script provides a "pure numpy" implementation without a python loop. The
output of the script is
In [53]: run pseudo_group_by.py
Label  Data
   20  [1 2 3]
   20  [1 2 4]
   10  [3 3 1]
    0  [5 0 0]
   20  [1 9 0]
   10  [2 3 4]
   20  [9 9 1]

Label  Num.  Sum
    0     1  [5 0 0]
   10     2  [5 6 5]
   20     4  [12 22 8]
A drawback of the method is that it will make a reordered copy of the
data. I haven't compared the performance to pandas.
Warren
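
The attached script is not reproduced in this archive. As a rough sketch of
how a loop-free "group-by sum" could be written, the code below assumes a
stable argsort of the labels followed by np.add.reduceat. The name group_sum
is hypothetical, and this is not necessarily the method the script uses. It
does show the drawback mentioned above: data[order] is a reordered copy.
The sketch assumes at least one row of input.

```python
import numpy as np

def group_sum(labels, data):
    """Sum the rows of `data` that share a label (hypothetical helper)."""
    labels = np.asarray(labels)
    data = np.asarray(data)
    # Sort so that rows with equal labels become adjacent.
    order = np.argsort(labels, kind="stable")
    sorted_labels = labels[order]
    sorted_data = data[order]          # reordered copy of the data
    # Index of the first row of each run of equal labels.
    starts = np.flatnonzero(
        np.r_[True, sorted_labels[1:] != sorted_labels[:-1]])
    # Sum each run of rows along axis 0.
    sums = np.add.reduceat(sorted_data, starts, axis=0)
    # Number of rows in each run.
    counts = np.diff(np.r_[starts, len(sorted_labels)])
    return sorted_labels[starts], counts, sums

labels = np.array([20, 20, 10, 0, 20, 10, 20])
data = np.array([[1, 2, 3], [1, 2, 4], [3, 3, 1], [5, 0, 0],
                 [1, 9, 0], [2, 3, 4], [9, 9, 1]])
uniq, counts, sums = group_sum(labels, data)
for label, num, total in zip(uniq, counts, sums):
    print(label, num, total)
```

With the labels and data from the output above, this yields the same three
groups, counts, and sums.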
> 2012/2/1 Frédéric Bastien <nouiz@nouiz.org>
>> It will be slow, but you can make a python loop.
>> Fred
>>> Hello!
>>> I use SciPy in computer graphics applications. My task is to calculate
>>> vertex normals by averaging face normals. In other words I want to
>>> accumulate vectors with the same ids. For example,
>>> ids = numpy.array([0, 1, 1, 2])
>>> n = numpy.array([ [0.1, 0.1, 0.1], [0.1, 0.1, 0.1], [0.1, 0.1, 0.1],
>>> [0.1, 0.1, 0.1] ])
>>> I need result:
>>> nv = ([ [0.1, 0.1, 0.1], [0.2, 0.2, 0.2], [0.1, 0.1, 0.1]])
>>> The simplest code:
>>> nv[ids] += n
>>> does not work, as I know. For 1D arrays I use the
>>> numpy.bincount(...) function, but it does not work for 2D arrays.
>>> So, my question: what is the best way to calculate an accumulated sum
>>> for 2D arrays using indirect indexes?
>>> Sincerely,
>>> Alexander
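
To illustrate why nv[ids] += n fails in the original question: with repeated
ids, the fancy-indexed in-place add is buffered, so each target row is
written only once. A minimal sketch follows, assuming NumPy 1.8 or later,
where the unbuffered np.add.at is available (that release postdates this
thread). The names nv_buffered and nv are chosen for illustration only.

```python
import numpy as np

ids = np.array([0, 1, 1, 2])
n = np.full((4, 3), 0.1)       # four face normals, all [0.1, 0.1, 0.1]

# Buffered fancy-index add: the repeated id 1 is applied only once.
nv_buffered = np.zeros((3, 3))
nv_buffered[ids] += n

# Unbuffered add: every row of n is accumulated into its target row.
nv = np.zeros((3, 3))
np.add.at(nv, ids, n)

print(nv_buffered[1])          # row 1 received a single contribution
print(nv[1])                   # row 1 received both contributions
```

np.add.at works on the 2D array directly, without sorting or copying the
data, though it may be slower than a sort-based approach on large inputs.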
