[Numpy-discussion] Numpy 2D array from a list error

Skipper Seabold jsseabold@gmail....
Wed Sep 23 10:43:51 CDT 2009


On Wed, Sep 23, 2009 at 9:42 AM, davew0000 <davejwood@gmail.com> wrote:
>
> Hi,
>
> I've got a fairly large (but not huge, 58mb) tab separated text file, with
> approximately 200 columns and 56k rows of numbers and strings.
>
> Here's a snippet of my code to create a numpy matrix from the data file...
>
> ####
>
> import sys
> from numpy import array
> data = [line.strip().split('\t') for line in sys.stdin]  # one list of fields per row
> data = array(data)
>
> ###
> The line "data = array(data)" causes the following error:
>
>> ValueError: setting an array element with a sequence
>
> If I take the 1st 40,000 lines of the file, it works fine.
> If I take the last 40,000 lines of the file, it also works fine, so it isn't
> a problem with the file.
>
> I've found a few other posts complaining of the same problem, but none of
> their fixes work.
>
> It seems like a memory problem to me. This was reinforced when I tried to
> break the dataset into 3 chunks and stack the resulting arrays - I got an
> error message saying "memory error".
> I don't really understand why reading in this 57mb txt file is taking up
> ~2 GB of RAM.
>
> Any advice? Thanks in advance
>

Without knowing more, I wouldn't expect a real memory error from
loading a 57 MB file, or from stacking it after splitting it into
three chunks. Try numpy.genfromtxt or numpy.loadtxt instead. Either
should work without a problem unless there is something funny about
your file, such as rows with differing numbers of fields.
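
As a rough sketch of the genfromtxt suggestion (not tested against the
original file): the sample data below is made up, and dtype=None and
encoding="utf-8" are assumptions chosen so that numpy infers a type per
column for mixed numbers and strings. With a real file you would pass
the file name or an open file object instead of the StringIO.

```python
import io

import numpy as np

# Hypothetical stand-in for the real file: tab-separated, numbers and strings.
sample = io.StringIO("1.5\tfoo\t3\n2.5\tbar\t4\n")

# dtype=None asks genfromtxt to infer each column's type separately.
# Mixed column types give a 1D structured array, one record per row,
# with default field names f0, f1, f2, ...
data = np.genfromtxt(sample, delimiter="\t", dtype=None, encoding="utf-8")

print(data.shape)        # number of rows
print(data.dtype.names)  # one field name per column
print(data["f0"])        # a whole column, selected by field name
```

Note that the result here is a structured 1D array rather than a plain
2D array, because the columns do not share a single type.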

Skipper
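
One way to check whether there is "something funny" about the file is
to count the fields on each row. "ValueError: setting an array element
with a sequence" is what array() typically raises when the nested lists
do not all have the same length. The helper name field_counts and the
sample data are hypothetical, for illustration only.

```python
import io
from collections import Counter


def field_counts(lines, sep="\t"):
    """Map number-of-fields to the number of rows that have that many."""
    # Strip only the newline so that empty leading/trailing fields are kept.
    return Counter(len(line.rstrip("\n").split(sep)) for line in lines)


# Hypothetical sample; the second row is missing a field.
sample = io.StringIO("a\t1\t2\nb\t3\nc\t4\t5\n")
counts = field_counts(sample)
print(counts)

if len(counts) > 1:
    print("rows differ in length; array() cannot build a regular 2D array")
```

If more than one field count shows up, the rows are ragged. Also note
that strip() with no arguments removes leading and trailing tabs as
well as the newline, so a row whose first or last fields are empty ends
up with fewer fields after splitting.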

