[SciPy-user] Dealing with Large Data Sets
Sat May 10 09:14:18 CDT 2008
I am trying to create an array called 'results' as shown in the example below. Is there a way to do this operation more efficiently as the number of 'data_x' arrays grows? I am also looking for pointers on eliminating the intermediate 'data_x' arrays while building 'results' in the following procedure.
from numpy import *
from numpy.random import *
# what is the best way to create an array named 'results' below
# when the number of 'data_x' arrays (i.e., x = 1, 2, ..., 1000) is large.
# Also, nrows and ncolumns can go up to 10000.
nrows = 5
ncolumns = 10
data_1 = zeros([nrows, ncolumns], 'd')
data_2 = zeros([nrows, ncolumns], 'd')
data_3 = zeros([nrows, ncolumns], 'd')
# to store squared sum of each column from the arrays above
results = zeros([3,ncolumns], 'd')
# loop to store raw data from a numerical operation;
# rand() is given as an example here
for i in range(nrows):
    for j in range(ncolumns):
        data_1[i,j] = rand()
        data_2[i,j] = rand()
        data_3[i,j] = rand()

# store squared sum of each column from data_x
for k in range(ncolumns):
    results[0,k] = dot(data_1[:,k], data_1[:,k])
    results[1,k] = dot(data_2[:,k], data_2[:,k])
    results[2,k] = dot(data_3[:,k], data_3[:,k])
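One possible approach (a sketch, not the only answer): if the filling step really is elementwise like rand() here, you can stack all the data sets into a single 3-D array and replace both loops with vectorized operations. The names `ndatasets`, `data`, and the use of `np.random.rand` as a stand-in for your numerical operation are assumptions for illustration; the squared column sum is just an elementwise square followed by a sum over the row axis.

```python
import numpy as np

nrows, ncolumns, ndatasets = 5, 10, 3

# One 3-D array of shape (ndatasets, nrows, ncolumns) replaces the
# separate data_1, data_2, data_3 arrays.  rand() stands in for
# whatever numerical operation fills the data.
data = np.random.rand(ndatasets, nrows, ncolumns)

# Squared sum of each column: square elementwise, then sum over the
# row axis (axis=1).  results has shape (ndatasets, ncolumns), the
# same as the loop version built row by row.
results = (data ** 2).sum(axis=1)
```

If memory is the concern (1000 data sets of 10000 x 10000 doubles will not fit), you can instead generate one `(nrows, ncolumns)` block at a time inside a loop over x, accumulate its `(block ** 2).sum(axis=0)` row into `results`, and discard the block, so only one intermediate array is alive at once.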