# [SciPy-user] Dealing with Large Data Sets

lechtlr lechtlr@yahoo....
Sat May 10 09:14:18 CDT 2008

```I am trying to create an array called 'results', as shown in the example below.  Is there a more efficient way to do this when the number of 'data_x' arrays gets large?  I am also looking for pointers on eliminating the intermediate 'data_x' arrays while building 'results' in the following procedure.

Thanks,
Lex

from numpy import *
from numpy.random import *

# What is the best way to create the array named 'results' below
# when the number of 'data_x' arrays (i.e., x = 1, 2, ..., 1000) is large?
# Also, nrows and ncolumns can go up to 10000.

nrows = 5
ncolumns = 10

data_1 = zeros([nrows, ncolumns], 'd')
data_2 = zeros([nrows, ncolumns], 'd')
data_3 = zeros([nrows, ncolumns], 'd')

# to store squared sum of each column from the arrays above
results = zeros([3,ncolumns], 'd')

# loop to store raw data from a numerical operation;
# rand() is given as an example here
for i in range(nrows):
    for j in range(ncolumns):
        data_1[i,j] = rand()
        data_2[i,j] = rand()
        data_3[i,j] = rand()

# store squared sum of each column from data_x
for k in range(ncolumns):
    results[0,k] = dot(data_1[:,k], data_1[:,k])
    results[1,k] = dot(data_2[:,k], data_2[:,k])
    results[2,k] = dot(data_3[:,k], data_3[:,k])

print(results)

```
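
One possible way to approach the question above, offered as a minimal sketch rather than a definitive answer: the per-column squared sums can be computed without the inner Python loops, and without keeping every 'data_x' array alive at once. The names `make_data`, `n_datasets`, and `results_streamed` are hypothetical and not from the original post, and `make_data` merely stands in for whatever numerical operation produces each array.

```python
import numpy as np

nrows, ncolumns, n_datasets = 5, 10, 3
rng = np.random.default_rng(0)

def make_data(rng, nrows, ncolumns):
    # Hypothetical stand-in for the real numerical operation.
    return rng.random((nrows, ncolumns))

# Option 1: hold all data sets in one 3-D array (fine while it fits in memory).
data = rng.random((n_datasets, nrows, ncolumns))
results = (data ** 2).sum(axis=1)  # shape (n_datasets, ncolumns)

# Option 2: process one data set at a time, so no 'data_x' array is kept.
results_streamed = np.empty((n_datasets, ncolumns))
for x in range(n_datasets):
    d = make_data(rng, nrows, ncolumns)
    # Sum d[i, j] * d[i, j] over rows i, giving one value per column j.
    results_streamed[x] = np.einsum('ij,ij->j', d, d)
```

A single 10000 x 10000 array of doubles takes about 800 MB, so with 1000 such arrays the second option is likely the only practical one: it needs memory for one data set plus the small `results_streamed` array.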
