# [Numpy-discussion] summing over more than one axis

josef.pktd@gmai... josef.pktd@gmai...
Thu Aug 19 16:20:03 CDT 2010

```
On Thu, Aug 19, 2010 at 4:03 PM, John Salvatier
<jsalvati@u.washington.edu> wrote:
> Precise in what sense? Numerical accuracy? If so, why is that?

I don't remember where I ran into this example, maybe with integers.
The NIST ANOVA test cases have some nasty, badly scaled variables.

But I have trouble constructing one myself; below, the results differ
only in the 10th or a later significant digit:

>>> a = 1000000*np.random.randn(10000,1000)
>>> a.sum()
-820034796.05545747
>>> np.sort(a.ravel())[::-1].sum()
-820034795.87886333
>>> np.sort(a.ravel()).sum()
-820034795.88172638
>>> np.sort(a,0)[::-1].sum()
-820034795.82333243
>>> np.sort(a,1)[::-1].sum()
-820034796.05559027
>>> a.sum(-1).sum(-1)
-820034796.05551744
>>> np.sort(a,1)[::-1].sum(-1).sum(-1)
-820034796.05543578
>>> np.sort(a,0)[::-1].sum(-1).sum(-1)
-820034796.05590343
>>> np.sort(a,1).sum(-1).sum(-1)
-820034796.05544424
>>> am = a.mean()
>>> am*a.size + np.sort(a-am,1).sum(-1).sum(-1)
-820034796.05554879
>>> a.size * np.sort(a,1).mean(-1).mean(-1)
-820034796.05544722

but I'm not able to get a discrepancy worse than the 10th or 11th
significant digit in randomly generated examples of size 10000x1000

Josef

>
> On Thu, Aug 19, 2010 at 12:13 PM, <josef.pktd@gmail.com> wrote:
>>
>> On Thu, Aug 19, 2010 at 11:29 AM, Joe Harrington <jh@physics.ucf.edu>
>> wrote:
>> > On Thu, 19 Aug 2010 09:06:32 -0500, Gökhan Sever <gokhansever@gmail.com>
>> > wrote:
>> >
>> >>On Thu, Aug 19, 2010 at 9:01 AM, greg whittier <gregwh@gmail.com> wrote:
>> >>
>> >>> I frequently deal with 3D data and would like to sum (or find the
>> >>> mean, etc.) over the last two axes.  I.e. sum a[i,j,k] over j and k.
>> >>> I find using .sum() really convenient for 2d arrays but end up
>> >>> reshaping 2d arrays to do this.  I know there has to be a more
>> >>> convenient way.  Here's what I'm doing
>> >>>
>> >>> a = np.arange(27).reshape(3,3,3)
>> >>>
>> >>> # sum over axis 1 and 2
>> >>> result = a.reshape((a.shape[0], a.shape[1]*a.shape[2])).sum(axis=1)
>> >>>
>> >>> Is there a cleaner way to do this?  I'm sure I'm missing something
>> >>> obvious.
>> >>>
>> >>> Thanks,
>> >>> Greg
>> >>>
>> >>
>> >>Using two sums
>> >>
>> >>np.sum(np.sum(a, axis=-2), axis=1)
>> >
>> > Be careful.  This works for sums, but not for operations like median;
>> > the median of the row medians may not be the global median.  So, you
>> > need to do the medians in one step.  I'm not aware of a method cleaner
>> > than manually reshaping first.  There may also be speed reasons to do
>> > things in one step.  But, two steps may look cleaner in code.
>>
>> I think, two .sums() are the most accurate, if precision matters. One
>> big summation is often not very precise.
>>
>> Josef
>>
>>
>> >
>> > --jh--
>> > _______________________________________________
>> > NumPy-Discussion mailing list
>> > NumPy-Discussion@scipy.org
>> > http://mail.scipy.org/mailman/listinfo/numpy-discussion
```
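
The approaches discussed in the thread can be compared side by side. The following is a minimal sketch, not code from the thread: the tuple form `axis=(1, 2)` assumes NumPy 1.7 or later (which postdates this discussion), `np.random.default_rng` assumes NumPy 1.17 or later, and the array `b`, the seed, and the use of `math.fsum` as a correctly rounded reference are illustrative choices of mine.

```python
import math

import numpy as np

# Same toy array as in the original question.
a = np.arange(27).reshape(3, 3, 3)

# 1. Reshape approach from the question: flatten the last two axes, then sum.
by_reshape = a.reshape(a.shape[0], -1).sum(axis=1)

# 2. Two successive sums over the last axis, as suggested in the thread.
by_two_sums = a.sum(axis=-1).sum(axis=-1)

# 3. A tuple of axes (assumption: NumPy >= 1.7).
by_tuple = a.sum(axis=(1, 2))

# Median caveat: the median of row medians need not be the global median.
# b is a hypothetical example chosen so that the two disagree.
b = np.array([[[0, 0, 10],
               [0, 0, 10],
               [1, 1, 1]]])
median_one_step = np.median(b.reshape(b.shape[0], -1), axis=1)
median_two_steps = np.median(np.median(b, axis=-1), axis=-1)

# Accuracy check: compare one big sum and two nested sums against
# math.fsum, which returns a correctly rounded floating-point sum.
rng = np.random.default_rng(0)  # arbitrary seed, for repeatability only
x = 1e6 * rng.standard_normal((1000, 100))
reference = math.fsum(x.ravel())
err_one_sum = abs(x.sum() - reference)
err_two_sums = abs(x.sum(axis=-1).sum(axis=-1) - reference)

print(by_reshape, by_two_sums, by_tuple)
print(median_one_step, median_two_steps)
print(err_one_sum, err_two_sums)
```

All three sums agree exactly on the integer array. The two medians differ for `b`, which is the point of the warning about doing medians in one step. The two error values depend on the NumPy version and its internal summation order, so they are best read as a way to measure the effect on your own data rather than as a general ranking of the two methods.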