[Numpy-discussion] Optimizing reduction loops (sum(), prod(), et al.)

Charles R Harris charlesr.harris@gmail....
Wed Jul 8 18:24:52 CDT 2009

On Wed, Jul 8, 2009 at 5:02 PM, Pauli Virtanen <pav+sp@iki.fi> wrote:

> On 2009-07-08, Stéfan van der Walt <stefan@sun.ac.za> wrote:
> > I know very little about cache optimality, so excuse the triviality of
> > this question: Is it possible to design this loop optimally (taking
> > into account certain build-time measurable parameters), or is it the
> > kind of thing that can only be discovered by tuning at compile-time?
> > ATNumPy... scary :-)
> I'm still kind of hoping that it's possible to make some minimal
> assumptions about CPU caches in general, and have a rule that
> decides a code path that is good enough, if not optimal.
> I don't think we want to go the ATNumPy route, or even have
> tunable parameters chosen at build or compile time. (Unless, of
> course, we want to bring a monster into the world -- think about
> cross-breeding distutils with the ATLAS build system :)

Sort of the software version of the Human Fly. Sounds like next summer's
blockbuster movie.

