[Numpy-discussion] Optimized sum of squares
Sun Oct 18 12:37:55 CDT 2009
On Sun, Oct 18, 2009 at 12:06 PM, Skipper Seabold <email@example.com> wrote:
> On Sun, Oct 18, 2009 at 8:09 AM, Gael Varoquaux
> <firstname.lastname@example.org> wrote:
>> On Sun, Oct 18, 2009 at 09:06:15PM +1100, Gary Ruben wrote:
>>> Hi Gaël,
>>> If you've got a 1D array/vector called "a", I think the normal idiom is
>>> np.dot(a, a)
>>> For the more general case, I think
>>> np.tensordot(a, a, axes=something_else)
>>> should do it, where you should be able to figure out something_else for
>>> your particular case.
>> Ha, yes. Good point about the tensordot trick.
>> Thank you
> I'm curious about this as I use ss, which is just np.sum(a*a, axis),
> in statsmodels and didn't much think about it.
> There is
> import numpy as np
> from scipy.stats import ss
> a = np.ones(5000)
> timeit ss(a)
> 10000 loops, best of 3: 21.5 µs per loop
> timeit np.add.reduce(a*a)
> 100000 loops, best of 3: 15 µs per loop
> timeit np.dot(a,a)
> 100000 loops, best of 3: 5.38 µs per loop
> Does the number of loops matter in the timings, and is dot always faster
> even without the BLAS dot?
David once replied that it depends on ATLAS and the version of LAPACK/BLAS.
I usually switch to dot for 1d. tensordot looks too complicated to me:
I don't want to figure out the axes when I quickly want a sum of squares.
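
On the loop-count question: timeit reports total time divided by the number of loops, so a different loop count does not by itself bias the per-loop figure. The sketch below compares the three 1d variants from the quoted session with the same loop count for each. The loop count of 1000 and the three repeats are arbitrary assumptions, and the numbers will depend on the machine and on the BLAS that numpy is linked against.

```python
import timeit

import numpy as np

a = np.ones(5000)

# The three 1d variants discussed in the thread.
candidates = {
    "np.sum(a*a)": lambda: np.sum(a * a),
    "np.add.reduce(a*a)": lambda: np.add.reduce(a * a),
    "np.dot(a, a)": lambda: np.dot(a, a),
}

# Check that all variants compute the same value before timing them.
results = {name: fn() for name, fn in candidates.items()}

number = 1000  # loops per measurement (arbitrary choice)
per_loop = {}
for name, fn in candidates.items():
    # Best of 3 repeats, converted to seconds per loop.
    best = min(timeit.repeat(fn, number=number, repeat=3))
    per_loop[name] = best / number
    print(f"{name:20s} {per_loop[name] * 1e6:8.2f} µs per loop")
```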
I never tried timing tensordot for 2d arrays, especially for axis=0 on a
C-ordered array. If it's faster, it could be useful for rewriting stats.ss.
I don't remember np.add.reduce being much faster than np.sum. The difference
might be the additional call overhead from going through another function in between.
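
For the 2d, axis=0 case, here is a minimal sketch of what the tensordot axes could look like. The array shape and random data are assumptions for illustration. Note that contracting axis 0 of both operands yields the full Gram matrix, so the per-column sums of squares are only its diagonal; whether that beats np.sum(a*a, axis=0) would need to be timed. np.einsum, which was added to numpy after this thread, computes only the diagonal.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal((1000, 50))  # C-ordered 2d array (illustrative shape)

# Reference: the ss definition from the thread, np.sum(a*a, axis).
ss_ref = np.sum(a * a, axis=0)

# Contracting axis 0 of both operands gives the (50, 50) matrix a.T @ a.
# The per-column sums of squares sit on its diagonal.
gram = np.tensordot(a, a, axes=([0], [0]))
ss_tensordot = np.diag(gram)

# einsum sums a[i, j] * a[i, j] over i only, without the off-diagonal terms.
ss_einsum = np.einsum("ij,ij->j", a, a)

# Sum of squares over all elements: contract every axis.
ss_total = np.tensordot(a, a, axes=a.ndim)
```

If only the per-column values are needed, the Gram-matrix route does extra work that grows with the square of the number of columns, so it seems unlikely to pay off for wide arrays.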