[Numpy-discussion] odd performance of sum?
Thu Feb 10 14:38:52 CST 2011
On Thu, Feb 10, 2011 at 8:31 PM, Pauli Virtanen <email@example.com> wrote:
> Thu, 10 Feb 2011 12:16:12 -0600, Robert Kern wrote:
> > One thing that might be worthwhile is to make
> > implementations of sum() and cumsum() that avoid the ufunc machinery and
> > do their iterations more quickly, at least for some common combinations
> > of dtype and contiguity.
> I wonder what is the balance between the iterator overhead and the time
> taken in the reduction inner loop. This should be straightforward to
> Apparently, some overhead decreased with the new iterators, since current
> Numpy master outperforms 1.5.1 by a factor of 2 for this benchmark:
> In : %timeit M.sum(1) # Numpy 1.5.1
> 10 loops, best of 3: 85 ms per loop
> In : %timeit M.sum(1) # Numpy master
> 10 loops, best of 3: 49.5 ms per loop
> I don't think this is explainable by the new memory layout optimizations,
> since M is C-contiguous.
> Perhaps there would be room for more optimization, even within the ufunc
I hope so. Please let me know if there's anything I can do to help
advance this. (My C skills are already a bit rusty, but at any higher level
I'll do my best to contribute.)
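
For anyone who wants to repeat the comparison on their own build, here is a
rough sketch of the timing using the standard timeit module instead of
IPython's %timeit. The shape and dtype of M are my assumption (the thread
does not state them), and best_time is just a hypothetical helper name, so
the absolute numbers will not match the ones quoted above.

```python
import timeit

import numpy as np


def best_time(func, number=10, repeat=3):
    """Best per-call time in seconds, similar to %timeit's 'best of 3'."""
    return min(timeit.repeat(func, number=number, repeat=repeat)) / number


# Assumption: M's shape and dtype are not given in the thread; these are
# illustrative only. np.ones returns a C-contiguous array by default.
M = np.ones((1000, 1000), dtype=np.float64)

t_rows = best_time(lambda: M.sum(1))  # reduce along the last (contiguous) axis
t_cols = best_time(lambda: M.sum(0))  # reduce along the first axis
t_all = best_time(lambda: M.sum())    # reduce over everything

print("numpy", np.__version__)
print("M.sum(1): %.3f ms per loop" % (t_rows * 1e3))
print("M.sum(0): %.3f ms per loop" % (t_cols * 1e3))
print("M.sum():  %.3f ms per loop" % (t_all * 1e3))
```

Running the same script under two NumPy installations gives a like-for-like
comparison; timing the two axes separately should also give a first hint of
how much depends on the direction of the reduction.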
> Pauli Virtanen