[SciPy-User] Accumulation sum using indirect indexes
Travis Oliphant
travis@continuum...
Thu Feb 2 13:11:55 CST 2012
On Feb 2, 2012, at 1:01 PM, josef.pktd@gmail.com wrote:
> On Thu, Feb 2, 2012 at 1:16 PM, Warren Weckesser
> <warren.weckesser@enthought.com> wrote:
>>
>>
>> On Wed, Feb 1, 2012 at 10:34 AM, Alexander Kalinin <alec.kalinin@gmail.com>
>> wrote:
>>>
>>> Yes, but for large data sets loops are quite slow. I have tried Pandas
>>> groupby.sum() and it works faster.
>>>
>>
>>
>> Pandas is probably the correct tool to use for this, but it will be nice
>> when numpy has a native "group-by" capability.
>>
>> For what it's worth (had to scratch the itch, so to speak), the attached
>> script provides a "pure numpy" implementation without a python loop. The
>> output of the script is
>>
>> In [53]: run pseudo_group_by.py
>> Label   Data
>>    20   [1 2 3]
>>    20   [1 2 4]
>>    10   [3 3 1]
>>     0   [5 0 0]
>>    20   [1 9 0]
>>    10   [2 3 4]
>>    20   [9 9 1]
>>
>> Label   Num.   Sum
>>     0      1   [5 0 0]
>>    10      2   [5 6 5]
>>    20      4   [12 22 8]
>>
>>
>> A drawback of the method is that it will make a reordered copy of the data.
>> I haven't compared the performance to pandas.
>
> nice use of reduceat, I found it recently in an example but haven't used it yet.
> It looks convenient if labels are presorted and numeric.
Reduceat is pretty convenient, but it's limited right now because the reductions have to use contiguous fence-posts. There is a NEP, along with the group-by NEP, for a reduce that takes arbitrary index ranges for its reductions.
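
For readers following along, the sort-then-reduceat idea might be sketched roughly as below. This is not the attached script (which is not reproduced in this message); the helper name `group_sum` is made up for illustration, and it assumes a non-empty 1-D label array and a 2-D data array with one row per label. The sample data is the one from the output quoted above.

```python
import numpy as np

def group_sum(labels, data):
    """Sum the rows of `data` that share a label (hypothetical helper)."""
    labels = np.asarray(labels)
    data = np.asarray(data)
    # Stable sort so that equal labels become contiguous. Indexing with
    # `order` creates the reordered copy mentioned as a drawback.
    order = np.argsort(labels, kind="mergesort")
    sorted_labels = labels[order]
    sorted_data = data[order]
    # Fence-posts: the index at which each run of equal labels starts.
    starts = np.flatnonzero(
        np.r_[True, sorted_labels[1:] != sorted_labels[:-1]])
    # reduceat sums each contiguous slice [starts[i], starts[i+1]).
    sums = np.add.reduceat(sorted_data, starts, axis=0)
    counts = np.diff(np.r_[starts, len(sorted_labels)])
    return sorted_labels[starts], counts, sums

labels = np.array([20, 20, 10, 0, 20, 10, 20])
data = np.array([[1, 2, 3], [1, 2, 4], [3, 3, 1], [5, 0, 0],
                 [1, 9, 0], [2, 3, 4], [9, 9, 1]])
uniq, counts, sums = group_sum(labels, data)
```

The sort is what turns arbitrary labels into the contiguous ranges that reduceat needs, which is why presorted labels make it especially convenient.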
-Travis
>
> Josef
>
>>
>> Warren
>>
>>
>>>
>>>
>>> 2012/2/1 Frédéric Bastien <nouiz@nouiz.org>
>>>>
>>>> It will be slow, but you can make a python loop.
>>>>
>>>> Fred
>>>>
>>>> On Jan 31, 2012 3:34 PM, "Alexander Kalinin" <alec.kalinin@gmail.com>
>>>> wrote:
>>>>>
>>>>> Hello!
>>>>>
>>>>> I use SciPy in computer graphics applications. My task is to calculate
>>>>> vertex normals by averaging face normals. In other words I want to
>>>>> accumulate vectors with the same ids. For example,
>>>>>
>>>>> ids = numpy.array([0, 1, 1, 2])
>>>>> n = numpy.array([ [0.1, 0.1, 0.1], [0.1, 0.1, 0.1], [0.1, 0.1, 0.1],
>>>>> [0.1, 0.1, 0.1] ])
>>>>>
>>>>> I need result:
>>>>> nv = ([ [0.1, 0.1, 0.1], [0.2, 0.2, 0.2], [0.1, 0.1, 0.1]])
>>>>>
>>>>> The simplest code:
>>>>> nv[ids] += n
>>>>> does not work, I know about this. For 1D arrays I use
>>>>> numpy.bincount(...) function. But this function does not work for 2D arrays.
>>>>>
>>>>> So, my question. What is the best way to calculate an accumulation sum for 2D
>>>>> arrays using indirect indexes?
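
One possible way to extend the bincount approach to the 2D case is to call it once per column, passing that column as the weights. The sketch below uses the example arrays from the question; the name `nv_wrong` is made up to show why the plain `+=` falls short, and the ids are assumed to be non-negative integers.

```python
import numpy as np

ids = np.array([0, 1, 1, 2])
n = np.array([[0.1, 0.1, 0.1],
              [0.1, 0.1, 0.1],
              [0.1, 0.1, 0.1],
              [0.1, 0.1, 0.1]])

# Fancy-indexed += is buffered, so the repeated id 1 is added only once.
nv_wrong = np.zeros((3, 3))
nv_wrong[ids] += n

# One weighted bincount per column accumulates repeated ids correctly.
nv = np.column_stack([np.bincount(ids, weights=n[:, k])
                      for k in range(n.shape[1])])
```

The Python loop here runs over the three vector components only, not over the rows, so it should stay cheap for large numbers of faces.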
>>>>>
>>>>> Sincerely,
>>>>> Alexander
>>>>>
>>>>> _______________________________________________
>>>>> SciPy-User mailing list
>>>>> SciPy-User@scipy.org
>>>>> http://mail.scipy.org/mailman/listinfo/scipy-user
>>>>>