[Numpy-discussion] Vectorizing code, for loops, and all that

Albert Strasheim 13640887 at sun.ac.za
Mon Oct 2 19:42:24 CDT 2006

Hello all

> -----Original Message-----
> From: numpy-discussion-bounces at lists.sourceforge.net [mailto:numpy-
> discussion-bounces at lists.sourceforge.net] On Behalf Of Travis Oliphant
> Sent: 03 October 2006 02:32
> To: Discussion of Numerical Python
> Subject: Re: [Numpy-discussion] Vectorizing code, for loops, and all that
> Travis Oliphant wrote:
> >
> >I suspect I know why, although the difference seems rather large.
> >
> [snip]
> >I'm surprised the overhead of adjusting pointers is so high, but then
> >again you are probably getting a lot of cache misses in the first case
> >so there is more to it than that, the loops may run more slowly too.
> >
> >
> I'm personally bothered that this example runs so much more slowly.  I
> don't think it should.  Perhaps it is unavoidable because of the
> memory-layout issues.  It is just hard to believe that the overhead for
> calling into the loop and adjusting the pointers is so much higher.

Firstly, thanks to Tim... I'll try his functions tomorrow.

Meanwhile, I can confirm that the NOBUFFER_UFUNCLOOP case in
PyUFunc_GenericFunction is getting exercised in the slower case. Here's some
info on what's happening, courtesy of Rational Quantify:

while (loop->index < loop->size) {
  for (i=0; i<self->nargs; i++)
    loop->bufptr[i] = loop->iters[i]->dataptr;        /* [1] */

  loop->function((char **)loop->bufptr, &(loop->bufcnt),
                 loop->steps, loop->funcdata);        /* [2] */

  for (i=0; i<self->nargs; i++) {
    PyArray_ITER_NEXT(loop->iters[i]);                /* [3] */
  }
  /* remainder of the loop body not shown */
}

[1] 12.97% of function time
[2] 8.65% of function time
[3] 62.14% of function time
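
To put those percentages in context, here is a rough Python model of how much bookkeeping a loop of this shape does. It is only a sketch under assumptions: the function name outer_loop_cost, the (100000, 3) shape, and the count of three operands are made up for illustration, and it assumes the 1-d inner function runs along a single axis while one iterator per operand is advanced once per pass through the while loop. It does not measure anything in NumPy itself.

```python
from math import prod


def outer_loop_cost(shape, inner_axis, nargs=3):
    """Count the work done by a no-buffer style loop (simplified model).

    Assumption: the 1-d inner function covers `inner_axis` in one call,
    and each of the `nargs` operand iterators is advanced once per
    pass through the outer while loop.
    """
    inner_len = shape[inner_axis]
    outer_iters = prod(shape) // inner_len  # passes through the while loop
    return {
        "inner_calls": outer_iters,                # calls to loop->function
        "elements_per_call": inner_len,            # work done per call
        "iterator_advances": outer_iters * nargs,  # PyArray_ITER_NEXT calls
    }


# Hypothetical tall, narrow array with two inputs and one output.
short_inner = outer_loop_cost((100000, 3), inner_axis=1)
long_inner = outer_loop_cost((100000, 3), inner_axis=0)

print(short_inner)  # 100000 calls of 3 elements, 300000 iterator advances
print(long_inner)   # 3 calls of 100000 elements, 9 iterator advances
```

If the inner loop ends up along the short axis, each call to the inner function handles only a few elements, so the per-pass pointer setup and iterator advances can plausibly outweigh the arithmetic, which would be consistent with [3] dominating the profile.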

If statistics from elsewhere in the code would be helpful, let me know, and
I'll see if I can convince Quantify to cough them up.

> <snip>

Thanks for all the code and suggestions.


