[Numpy-discussion] Vectorizing code, for loops, and all that
Travis Oliphant
oliphant at ee.byu.edu
Thu Oct 5 08:34:45 CDT 2006
Travis Oliphant wrote:
>Albert Strasheim wrote:
>
>
>>>>[1] 12.97% of function time
>>>>[2] 8.65% of function time
>>>>[3] 62.14% of function time
>>>>
>>>>If statistics from elsewhere in the code would be helpful, let me
>>>>know,
>>>>
>>>>
>>>>
>>>and
>>>
>>>
>>>
>>>>I'll see if I can convince Quantify to cough it up.
>>>>
>>>>
>>>>
>>>>
>>>Please run the same test but using
>>>
>>>x1 = N.random.rand(39,2000)
>>>x2 = N.random.rand(39,64,1)
>>>
>>>z1 = x1[:,N.newaxis,:] - x2
>>>
>>>
>>>
>>Very similar results to what I had previously:
>>
>>[1] 10.88%
>>[2] 7.25%
>>[3] 68.25%
>>
>>
>>
>>
>Thanks,
>
>I've got some ideas about how to speed this up by eliminating some of
>the unnecessary calculations going on outside of the function loop, but
>there will still be some speed issues depending on how the array is
>traversed once you get above a certain size. I'm not sure there's any way
>around that, ultimately, due to memory access being slow on most hardware.
>
>
Well, I tried out my ideas and didn't get much improvement (8-10%).

Then I finally realized that the slowness was due to the loop running
over an axis with a very large stride, so that memory access was taking
a long time.

So, instead of picking the loop axis to be the one with the longest
dimension, I now pick the loop axis to be the one with the smallest sum
of strides.

In this particular example, the speed-up is about 6-7 times...
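
To make the "smallest sum of strides" idea concrete, here is a rough Python sketch using the arrays from the test above. It is only an illustration of the heuristic, not the actual C code in the ufunc machinery: the helper name stride_sums is made up for this example, and it assumes the sum is taken over the broadcast inputs plus a freshly allocated C-ordered output.

```python
import numpy as np

def stride_sums(*operands):
    """Per-axis sum of byte strides over the broadcast inputs and a C-ordered output.

    Hypothetical helper for illustration; NumPy does not expose this function.
    """
    shape = np.broadcast(*operands).shape
    out = np.empty(shape, dtype=np.result_type(*operands))
    # broadcast_to gives stride 0 along any axis an operand is stretched over
    views = [np.broadcast_to(a, shape) for a in operands] + [out]
    return [sum(abs(v.strides[ax]) for v in views) for ax in range(len(shape))]

x1 = np.random.rand(39, 2000)
x2 = np.random.rand(39, 64, 1)

sums = stride_sums(x1[:, np.newaxis, :], x2)
loop_axis = int(np.argmin(sums))  # axis with the cheapest step through memory
print(sums, loop_axis)
```

With float64 data, one step along the last axis moves only a few bytes in each array, while one step along the first axis jumps by roughly a megabyte in the output, so looping over the first axis would touch memory far less efficiently.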
-Travis