[SciPy-User] MemoryError with tsfromtxt

Bruce Southey bsouthey@gmail....
Thu Sep 9 08:29:32 CDT 2010


  On 09/09/2010 06:19 AM, Pierre GM wrote:
> On Sep 9, 2010, at 1:11 PM, Timmie wrote:
>
>>> You must have quite a huge file... Note that it's not a scikits.timeseries
>>> pb, just a standard numpy one.
>> The file has 298 MB.
>>
>> 5370772 records (rows); data in minutely frequency.
>>
>>>> And what could I do to mitigate it?
>>> Cut the file in pieces ?
>> and then concatenate the timeseries?
> That's the idea. Could you cut it day by day, or week by week, or even month by month to reduce the load ?
> The issue is that genfromtxt has to keep a lot of information in memory (a list of values, a list of masks) before creating the array, and you're overloading Python's capacity to deal with it...
You could buy more memory, since 5.4 million rows add up very quickly
when there are many columns. Note that the final array also needs a
contiguous block of memory to be available.
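
As a rough back-of-the-envelope check, the size of the final array can be
estimated as below. The row count is the one reported in this thread; the
column count (n_cols) is purely an assumption for illustration, since the
number of columns was not given. Only the finished float64 array is counted;
any intermediate Python lists built while parsing would need more on top.

```python
import numpy as np

n_rows = 5370772   # record count reported in the thread
n_cols = 10        # ASSUMPTION: the thread does not state the column count

# A float64 value occupies 8 bytes, so the final array needs
# rows * columns * 8 bytes in one contiguous block.
itemsize = np.dtype(np.float64).itemsize
array_bytes = n_rows * n_cols * itemsize
array_mib = array_bytes / 2.0**20

print("final array: %d bytes (about %.0f MiB)" % (array_bytes, array_mib))
```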

If you know the format of the input, then use something else such as
loadtxt. If you know both the size and the format, you can iterate over
the file line by line and write the values directly into a preallocated
empty array, or use Chris's code to append to an array.
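
A minimal sketch of the preallocated-array approach follows. It assumes a
comma-delimited file with a known number of rows and purely numeric columns;
the function name load_into_preallocated and its parameters are made up for
this example, and a date column like the one in a timeseries file would need
its own conversion. This is not Chris's append code, just the plain
fill-in-place idea.

```python
import io
import numpy as np

def load_into_preallocated(lines, n_rows, n_cols, delimiter=","):
    """Fill a preallocated float64 array one text line at a time.

    Hypothetical helper: `lines` is any iterable of strings, such as an
    open file object. Only one line is held as Python objects at a time.
    """
    out = np.empty((n_rows, n_cols), dtype=np.float64)
    filled = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if filled >= n_rows:
            raise ValueError("file has more rows than were preallocated")
        # Assigning a list of the wrong length raises a ValueError,
        # which catches rows with an unexpected number of fields.
        out[filled] = [float(field) for field in line.split(delimiter)]
        filled += 1
    # Return only the rows actually filled, in case the file was shorter.
    return out[:filled]

# Small in-memory stand-in for a real file.
sample = io.StringIO("1.0,2.0\n3.0,4.0\n")
data = load_into_preallocated(sample, n_rows=2, n_cols=2)
print(data)
```

With a real file, the StringIO object would be replaced by the result of
open() on the data file, and n_rows by the known record count.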

Bruce

