[Numpy-discussion] extremely slow array indexing?

Fang Fang fang.fang2003 at gmail.com
Thu Nov 30 11:59:13 CST 2006


Thanks for your reply. The simplified code is as follows. It takes 7 seconds
to process 1000 rows, which is tolerable, but I wonder why it takes so long.
Aren't vectorized operations supposed to run very quickly?

from numpy import *

componentcount = 300000
currSum = zeros(componentcount)
row = zeros(componentcount)  # current row
rowcount = 50000
for i in range(1, rowcount):
    row[:] = 1
    currSum = currSum + row
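
One thing that may be worth trying, offered as a sketch rather than a diagnosis: `currSum = currSum + row` builds a new 300000-element array on every pass, whereas `+=` adds into the existing array. Whether that accounts for the timing above has not been measured here. The helper name `column_sums` and the tiny fake dataset are assumptions made for illustration only.

```python
import numpy as np

def column_sums(rows, ncols):
    """Accumulate per-column sums over an iterable of 1-D rows (hypothetical helper)."""
    total = np.zeros(ncols)
    for row in rows:
        total += row  # in-place add: reuses `total` instead of allocating a new array
    return total

# Tiny fake data standing in for the 50k x 300k dataset (assumption)
fake_rows = [np.array([3.0, 1.0, 2.0]), np.array([3.0, 1.0, 2.0])]
sums = column_sums(fake_rows, ncols=3)

# Column indices ordered by ascending column sum
order = np.argsort(sums)
```

Because only the running total and the current row are held in memory, this still fits the read-one-line-at-a-time approach; `order` can then be used to rearrange the columns in a second pass over the data.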



On 11/30/06, Robert Kern <robert.kern at gmail.com> wrote:
>
> Fang Fang wrote:
> > Hi,
> >
> > I am writing code to sort the columns according to the sum of each
> > column. The dataset is huge (50k rows x 300k cols), so I have to read
> > it line by line and do the summation to avoid running out of memory.
> > But I don't know why it runs so slowly; part of the code is as
> > follows. Can anyone point out what needs to be modified to make it run
> > fast? Thanks in advance!
>
> Nothing leaps out. Generally, it's difficult (or impossible) to answer such
> questions without running code. Can you distill the time-consuming part into a
> small, self-contained script with fake data that we can run?
>
> --
> Robert Kern
>
> "I have come to believe that the whole world is an enigma, a harmless enigma
> that is made terrible by our own mad attempt to interpret it as though it had
> an underlying truth."
> -- Umberto Eco