[Numpy-discussion] odd performance of sum?

eat e.antero.tammi@gmail....
Thu Feb 10 14:29:10 CST 2011


Hi Robert,

On Thu, Feb 10, 2011 at 8:16 PM, Robert Kern <robert.kern@gmail.com> wrote:

> On Thu, Feb 10, 2011 at 11:53, eat <e.antero.tammi@gmail.com> wrote:
> > Thanks Chuck,
> >
> > for replying. But don't you still feel very odd that dot outperforms sum in
> > your machine? Just to get it simply; why sum can't outperform dot? Whatever
> > architecture (computer, cache) you have, it don't make any sense at all that
> > when performing significantly less instructions, you'll reach to spend more
> > time ;-).
>
> These days, the determining factor is less often instruction count
> than memory latency, and the optimized BLAS implementations of dot()
> heavily optimize the memory access patterns.

Can't we have this with a simple sum as well?
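
One way to see how the two compare on a given machine is to time them on the same data. The following is only a minimal sketch, not a rigorous benchmark: it assumes the comparison is between a.sum() and a dot product with a vector of ones, and the helper names (sum_via_dot, compare) and array sizes are made up for illustration. The outcome depends on the NumPy version and the BLAS it is linked against.

```python
import timeit

import numpy as np


def sum_via_dot(a):
    """Sum all elements of `a` via a dot product with a vector of ones."""
    flat = np.ascontiguousarray(a, dtype=np.float64).ravel()
    return float(np.dot(flat, np.ones_like(flat)))


def compare(m=1000, n=1000, repeat=5):
    """Return the best-of-`repeat` timings (seconds per 10 calls) for sum and dot."""
    rng = np.random.default_rng(0)
    a = rng.random((m, n))
    flat = a.ravel()            # contiguous 1-D view of the same data
    ones = np.ones(flat.size)   # built once, outside the timed region
    t_sum = min(timeit.repeat(lambda: a.sum(), number=10, repeat=repeat))
    t_dot = min(timeit.repeat(lambda: np.dot(flat, ones), number=10, repeat=repeat))
    return t_sum, t_dot


if __name__ == "__main__":
    t_sum, t_dot = compare()
    print("sum: %.6f s   dot: %.6f s" % (t_sum, t_dot))
```

Taking the minimum over several repeats reduces noise from other processes; it says nothing about why one call is faster than the other.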

> Additionally, the number
> of instructions in your dot() probably isn't that many more than the
> sum(). The sum() is pretty dumb

But does it need to be?

> and just does a linear accumulation
> using the ufunc reduce mechanism, so (m*n-1) ADDs plus quite a few
> instructions for traversing the array in a generic manner. With fused
> multiply-adds, being able to assume contiguous data and ignore the
> numpy iterator overhead, and applying divide-and-conquer kernels to
> arrange sums, the optimized dot() implementations could have a
> comparable instruction count.

Couldn't sum benefit from similar logic?
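
To make the "divide-and-conquer" idea from the quoted paragraph concrete, here is a small pure-Python sketch of pairwise summation. It only illustrates the arrangement of the additions; it is not how any BLAS or NumPy actually implements dot() or sum(), and the function name pairwise_sum and the block size of 8 are arbitrary choices for this example.

```python
import math


def pairwise_sum(values, block=8):
    """Sum a sequence of floats by recursively splitting it in half.

    Short runs (at most `block` items) are added linearly; longer runs
    are split in two and the partial sums are added together.
    """
    n = len(values)
    if n <= block:
        total = 0.0
        for v in values:
            total += v
        return total
    mid = n // 2
    return pairwise_sum(values[:mid], block) + pairwise_sum(values[mid:], block)


if __name__ == "__main__":
    data = [0.1] * 1000
    print(pairwise_sum(data), math.fsum(data))
```

The number of additions is the same as in a linear accumulation (one fewer than the number of items); only the order in which they are performed changes.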

> If you were willing to spend that amount of developer time and code
> complexity to make platform-specific backends to sum()

Actually I would, but I'm not at all competent at that level of detail (:.
I am, however, willing to spend more of my own time on, for example, testing,
debugging, and analysing various improvements and suggestions if such emerge.

> , you could make
> it go really fast, too. Typically, it's not all that important to make
> it worthwhile, though. One thing that might be worthwhile is to make
> implementations of sum() and cumsum() that avoid the ufunc machinery
> and do their iterations more quickly, at least for some common
> combinations of dtype and contiguity.
>
Well, I'm already perplexed before even reaching that 'ufunc machinery'; it's
anyway trivial (for us mere mortals ;-) to figure out what's
happening with sum in fromnumeric.py!
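
For reference, the relationship between sum and the "ufunc reduce mechanism" mentioned in the quoted text can be checked from the interpreter. This sketch only shows that np.sum and np.add.reduce agree on a small array; the exact dispatch path inside fromnumeric.py differs between NumPy versions and is not reproduced here. Note that np.add.reduce defaults to axis=0, so axis=None has to be passed explicitly to sum over all elements.

```python
import numpy as np

a = np.arange(12.0).reshape(3, 4)

# Sum over all elements: np.sum defaults to axis=None,
# np.add.reduce defaults to axis=0 and needs it spelled out.
total_sum = np.sum(a)
total_reduce = np.add.reduce(a, axis=None)

# Sum along the first axis (column sums) with both spellings.
col_sum = np.sum(a, axis=0)
col_reduce = np.add.reduce(a, axis=0)

print(total_sum, total_reduce)
print(col_sum, col_reduce)
```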


Regards,
eat

>
> --
> Robert Kern
>
> "I have come to believe that the whole world is an enigma, a harmless
> enigma that is made terrible by our own mad attempt to interpret it as
> though it had an underlying truth."
>   -- Umberto Eco