[Numpy-discussion] Numpy 2D array from a list error

davew0000 davejwood@gmail....
Wed Sep 23 08:42:43 CDT 2009


I've got a fairly large (but not huge, 58 MB) tab-separated text file, with
approximately 200 columns and 56k rows of numbers and strings.

Here's a snippet of my code to create a numpy matrix from the data file... 


import sys
from numpy import array

data = [x.strip().split('\t') for x in sys.stdin.readlines()]
data = array(data)

The call data = array(data) raises the following error:

> ValueError: setting an array element with a sequence 

If I take the 1st 40,000 lines of the file, it works fine. 
If I take the last 40,000 lines of the file, it also works fine, so it isn't
a problem with the file. 
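
One way to back that up would be to count the fields in every row, since rows of unequal length are a common trigger for this ValueError. The following is only a sketch, assuming the same strip/split parsing as in the snippet above; the helper name field_count_histogram is made up for illustration.

```python
from collections import Counter

def field_count_histogram(lines):
    """Count how many rows have each number of tab-separated fields."""
    return Counter(len(line.strip().split('\t')) for line in lines)

# Small in-memory example; the last row is one field short.
sample = ["a\t1\t2\n", "b\t3\t4\n", "c\t5\n"]
hist = field_count_histogram(sample)
print(hist)
```

For the real file, sys.stdin could be passed in place of sample. If more than one field count shows up, the rows are ragged. Note that strip() with no arguments also removes trailing tabs, so a row whose last fields are empty would come out shorter than the others.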

I've found a few other posts complaining of the same problem, but none of
their fixes work. 

It seems like a memory problem to me. This was reinforced when I tried to
break the dataset into 3 chunks and stack the resulting arrays: I got an
error message saying "memory error".
I don't really understand why reading in this 57 MB text file is taking up
~2 GB of RAM.
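
A rough way to see where the memory might be going is to compare the raw size of one row with the size of the same row after parsing. This is a sketch that assumes CPython, where every field becomes its own string object with per-object overhead; the helper name parsed_row_bytes and the sample row are hypothetical, and the exact numbers vary by Python version and platform.

```python
import sys

def parsed_row_bytes(line):
    """Approximate memory used by one row after strip/split parsing."""
    fields = line.strip().split('\t')
    # Size of the list itself (header plus pointers) plus each string object.
    return sys.getsizeof(fields) + sum(sys.getsizeof(f) for f in fields)

# Hypothetical row: 200 short numeric fields, similar in width to the file above.
line = '\t'.join(['1.5'] * 200) + '\n'
raw = len(line)                 # characters in the raw text row
parsed = parsed_row_bytes(line) # bytes held by the parsed list of strings
print(raw, parsed, parsed / raw)
```

Multiplying the ratio by the file size gives a ballpark figure for the list-of-lists alone. On top of that, readlines() keeps the raw lines in memory, and a NumPy string array is fixed-width, so every element takes as much room as the longest field.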

Any advice? Thanks in advance 

