[Numpy-discussion] Re: Vectorizing code, for loops, and all that

Travis Oliphant oliphant at ee.byu.edu
Mon Oct 2 20:50:35 CDT 2006


Albert Strasheim wrote:

>Hello all
>
>  
>
>>-----Original Message-----
>>From: numpy-discussion-bounces at lists.sourceforge.net [mailto:numpy-
>>discussion-bounces at lists.sourceforge.net] On Behalf Of Travis Oliphant
>>Sent: 03 October 2006 02:32
>>To: Discussion of Numerical Python
>>Subject: Re: [Numpy-discussion] Vectorizing code, for loops, and all that
>>
>>Travis Oliphant wrote:
>>
>>    
>>
>>>I suspect I know why, although the difference seems rather large.
>>>
>>>      
>>>
>>[snip]
>>
>>    
>>
>>>I'm surprised the overhead of adjusting pointers is so high, but then
>>>again you are probably getting a lot of cache misses in the first case
>>>so there is more to it than that, the loops may run more slowly too.
>>>
>>>
>>>      
>>>
>>I'm personally bothered that this example runs so much more slowly.  I
>>don't think it should.  Perhaps it is unavoidable because of the
>>memory-layout issues.  It is just hard to believe that the overhead for
>>calling into the loop and adjusting the pointers is so much higher.
>>    
>>
>
>Firstly, thanks to Tim... I'll try his functions tomorrow.
>
>Meanwhile, I can confirm that the NOBUFFER_UFUNCLOOP case in
>PyUFunc_GenericFunction is getting exercised in the slower case. Here's some
>info on what's happening, courtesy of Rational Quantify:
>
>case NOBUFFER_UFUNCLOOP:
>  while (loop->index < loop->size) {
>    for (i=0; i<self->nargs; i++)
>      loop->bufptr[i] = loop->iters[i]->dataptr; [1]
>
>    loop->function((char **)loop->bufptr, &(loop->bufcnt),
>                   loop->steps, loop->funcdata); [2]
>    UFUNC_CHECK_ERROR(loop);
>
>    for (i=0; i<self->nargs; i++) {
>      PyArray_ITER_NEXT(loop->iters[i]); [3]
>    }
>    loop->index++;
>  }
>  break;
>
>[1] 12.97% of function time
>[2] 8.65% of function time
>[3] 62.14% of function time
>
>If statistics from elsewhere in the code would be helpful, let me know, and
>I'll see if I can convince Quantify to cough it up.
>
>  
>
Please run the same test but using

import numpy as N

x1 = N.random.rand(39,2000)
x2 = N.random.rand(39,64,1)

z1 = x1[:,N.newaxis,:] - x2
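
For anyone reproducing this, here is a minimal timing sketch of the layout above. The helper name `subtract` and the use of `timeit` are assumptions for illustration, not part of the original test script. It only checks the broadcast shape and reports a rough per-call time, so the numbers will vary by machine and NumPy version.

```python
import timeit

import numpy as N  # alias chosen to match the snippet above

x1 = N.random.rand(39, 2000)
x2 = N.random.rand(39, 64, 1)


def subtract():
    # (39, 1, 2000) - (39, 64, 1) broadcasts to (39, 64, 2000)
    return x1[:, N.newaxis, :] - x2


z1 = subtract()
print(z1.shape)

# Take the best of a few repeats to reduce timing noise.
calls = 5
best = min(timeit.repeat(subtract, number=calls, repeat=3)) / calls
print("%.3f ms per call" % (best * 1e3))
```

Running the same harness on the original layout gives a side-by-side comparison of the two cases.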


Thanks,

-Travis








