[Numpy-discussion] Vectorizing code, for loops, and all that

Travis Oliphant oliphant at ee.byu.edu
Thu Oct 5 08:34:45 CDT 2006


Travis Oliphant wrote:

>Albert Strasheim wrote:
>
>>>>[1] 12.97% of function time
>>>>[2] 8.65% of function time
>>>>[3] 62.14% of function time
>>>>
>>>>If statistics from elsewhere in the code would be helpful, let me 
>>>>know, and I'll see if I can convince Quantify to cough it up.
>>>>
>>>Please run the same test but using
>>>
>>>x1 = N.random.rand(39,2000)
>>>x2 = N.random.rand(39,64,1)
>>>
>>>z1 = x1[:,N.newaxis,:] - x2
>>>
>>Very similar results to what I had previously:
>>
>>[1] 10.88%
>>[2] 7.25%
>>[3] 68.25%
>>
>>
>Thanks,
>
>I've got some ideas about how to speed this up by eliminating some of 
>the unnecessary calculations going on outside of the function loop, but 
>there will still be some speed issues depending on how the array is 
>traversed once you get above a certain size.  I'm not sure there is any 
>way around that, ultimately, due to memory access being slow on most 
>hardware.
>

Well, I tried out my ideas and didn't get much improvement (8-10%).  
Then, I finally realized more fully that the slowness was due to the 
loop taking place over an axis which had a very large stride so that the 
memory access was taking a long time. 
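
To make the stride issue concrete, here is a small sketch (an illustration, not code from NumPy itself) that prints the byte strides each operand has once it is broadcast to the result shape, using the test arrays quoted above. It assumes a NumPy recent enough to provide N.broadcast_to, which did not exist when this message was written, and it writes into an explicitly allocated C-contiguous output so the output strides are predictable.

```python
import numpy as N

# Test arrays from the quoted message (float64, 8 bytes per element).
x1 = N.random.rand(39, 2000)
x2 = N.random.rand(39, 64, 1)

# Shape of x1[:, N.newaxis, :] - x2 after broadcasting.
shape = N.broadcast(x1[:, N.newaxis, :], x2).shape

# Read-only views with the strides each operand has over the full shape.
# A broadcast axis gets stride 0: the same memory is re-read along it.
a = N.broadcast_to(x1[:, N.newaxis, :], shape)
b = N.broadcast_to(x2, shape)

# Explicit C-contiguous output so its strides are well defined.
out = N.empty(shape)
N.subtract(a, b, out=out)

print("shape      :", shape)
print("x1 strides :", a.strides)
print("x2 strides :", b.strides)
print("out strides:", out.strides)
```

Stepping along an axis whose strides are thousands of bytes touches a new region of memory on every iteration, while stepping along an axis with strides of 0 or 8 bytes stays within data that is already cached.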

Thus, instead of picking the loop axis to correspond to the axis with 
the longest dimension, I've picked the loop axis to be one with the 
smallest sum of strides.
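
The selection rule can be sketched in pure Python as follows. This is only an illustration of the idea: the helper names loop_axis_by_strides and loop_axis_by_length are made up here, the real logic lives in NumPy's C ufunc machinery, and the array shapes are hypothetical ones chosen so that the two rules disagree. It again assumes N.broadcast_to is available.

```python
import numpy as N

def loop_axis_by_strides(*arrays):
    """Hypothetical helper: pick the axis whose byte strides, summed over
    all arrays (inputs and output), are smallest."""
    shape = N.broadcast(*arrays).shape
    totals = [0] * len(shape)
    for arr in arrays:
        view = N.broadcast_to(arr, shape)
        for axis, stride in enumerate(view.strides):
            totals[axis] += abs(stride)
    return totals.index(min(totals)), totals

def loop_axis_by_length(*arrays):
    """Hypothetical helper: the older rule, pick the longest axis."""
    shape = N.broadcast(*arrays).shape
    return shape.index(max(shape))

# Hypothetical shapes (not from this thread's test) where the longest
# axis is also the one with the largest strides.
a = N.zeros((2000, 1, 39))
b = N.zeros((64, 39))
out = N.empty((2000, 64, 39))

axis, totals = loop_axis_by_strides(a, b, out)
print("stride sums per axis:", totals)
print("smallest-stride axis:", axis)
print("longest axis        :", loop_axis_by_length(a, b, out))
```

The sketch ignores everything except strides; a real implementation would presumably also need to weigh how many iterations the chosen axis provides against the per-call overhead of the inner loop.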

In this particular example, the speed-up is about 6-7 times...

-Travis




