[Numpy-discussion] extremely slow array indexing?

Tim Hochberg tim.hochberg at ieee.org
Thu Nov 30 12:20:36 CST 2006


Fang Fang wrote:
> Thanks for your reply. The simplified code is as follows. It takes 7 
> seconds to process 1000 rows, which is tolerable, but I wonder why it 
> takes so long. Aren't vectorized operations supposed to run very quickly?
>
> from numpy import *
>
> componentcount = 300000
> currSum = zeros(componentcount)
> row = zeros(componentcount) #current row
> rowcount = 50000
> for i in range(1,rowcount):
>     row[:] = 1
>     currSum = currSum + row
>

In this case, you can save a significant chunk of time (about 30% on my 
machine) by replacing the last line with:
    currSum += row
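
To make the difference concrete, here is a minimal sketch comparing the two forms. The function names and the reduced row count are my own choices for illustration, not part of the original code. The point it shows is that "currSum = currSum + row" builds a brand-new 300000-element array on every pass and rebinds the name to it, while "currSum += row" writes the result into the buffer that currSum already owns.

```python
import timeit

import numpy as np


def accumulate_out_of_place(n_rows, n):
    """Sum n_rows rows of ones, allocating a new result array each pass."""
    curr_sum = np.zeros(n)
    row = np.zeros(n)
    for _ in range(n_rows):
        row[:] = 1
        curr_sum = curr_sum + row  # new temporary array, then rebind the name
    return curr_sum


def accumulate_in_place(n_rows, n):
    """Sum n_rows rows of ones, reusing the same result buffer."""
    curr_sum = np.zeros(n)
    row = np.zeros(n)
    for _ in range(n_rows):
        row[:] = 1
        curr_sum += row  # writes into curr_sum's existing memory
    return curr_sum


if __name__ == "__main__":
    # Row count reduced from the original 50000 so this finishes quickly.
    n_rows, n = 100, 300000
    for fn in (accumulate_out_of_place, accumulate_in_place):
        seconds = timeit.timeit(lambda: fn(n_rows, n), number=1)
        print(fn.__name__, seconds)
```

Both functions return the same values; only the memory traffic differs. How large the speedup is will depend on the machine and the NumPy version, so treat the 30% figure as one data point rather than a guarantee.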

I suspect that the bulk of the remaining time is being chewed up moving 
things in and out of the cache, since your arrays are large. There are 
some things you could try to alleviate that, but I wouldn't worry about 
it: I suspect that the overall time in your original problem is going to 
be dominated by pulling stuff out of the dictionaries and populating row.
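
One way to check that suspicion is to time the two phases separately. The sketch below is only an illustration under assumptions: the original dictionary code was not posted, so the layout used here (one dict mapping component index to value), the function name time_phases, and the entry count are all hypothetical stand-ins.

```python
import time

import numpy as np


def time_phases(n_rows=20, n=300000, n_entries=30000):
    """Time filling a row from a dict separately from the vector add.

    The dict layout (component index -> value) is a hypothetical stand-in
    for the dictionaries in the original problem.
    """
    rng = np.random.default_rng(0)
    row_dict = {int(k): 1.0 for k in rng.integers(0, n, size=n_entries)}

    curr_sum = np.zeros(n)
    row = np.zeros(n)
    t_fill = 0.0
    t_add = 0.0
    for _ in range(n_rows):
        t0 = time.perf_counter()
        row[:] = 0
        for k, v in row_dict.items():  # element-by-element Python loop
            row[k] = v
        t1 = time.perf_counter()
        curr_sum += row                # single vectorized in-place add
        t2 = time.perf_counter()
        t_fill += t1 - t0
        t_add += t2 - t1
    return t_fill, t_add, curr_sum


if __name__ == "__main__":
    t_fill, t_add, _ = time_phases()
    print("populate row:", t_fill, "seconds")
    print("currSum += row:", t_add, "seconds")
```

If the populate phase turns out to be much larger than the add, tuning the addition any further is unlikely to change the total run time much.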

-tim


