[Numpy-discussion] loadtxt slow
Sun Mar 1 18:51:00 CST 2009
On Sun, Mar 1, 2009 at 11:29 AM, Michael Gilbert wrote:
> On Sun, 1 Mar 2009 16:12:14 -0500 Gideon Simpson wrote:
>> So I have some data sets of about 160000 floating point numbers stored
>> in text files. I find that loadtxt is rather slow. Is this to be
>> expected? Would it be faster if it were loading binary data?
> I have run into this as well. loadtxt uses a Python list to accumulate
> the data it reads in, so once that list grows to about a quarter of your
> available memory, the reallocations (one for every value read from the
> data file) start hitting swap instead of main memory, which is
> ridiculously slow (in fact it makes my system quite unresponsive, with a
> jumpy cursor). I have rewritten loadtxt to be smarter about allocating
> memory, but it is slower overall and doesn't support all of the original
> arguments/options (yet). I have some ideas to make it smarter/more
> efficient, but have not had the time to work on them recently.
> I will send the current version to the list tomorrow, when I have access
> to the system it is on.
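The rewritten loadtxt mentioned above hasn't been posted yet, but the "smarter about allocating memory" idea can be sketched as a two-pass reader: count the rows first, then fill a preallocated array in place instead of growing a Python list one value at a time. This is only an illustrative sketch (the function name `loadtxt_prealloc` is made up, and it handles none of loadtxt's options like comments, skiprows, or usecols):

```python
import numpy as np
from io import StringIO

def loadtxt_prealloc(f, dtype=float):
    """Sketch of a two-pass text loader: allocate the result once,
    rather than appending to a Python list value by value."""
    # First pass: count non-blank rows.
    start = f.tell()
    nrows = sum(1 for line in f if line.strip())
    f.seek(start)

    # Peek at the first row to learn the column count.
    first = f.readline().split()
    out = np.empty((nrows, len(first)), dtype=dtype)

    # Second pass: fill rows in place, no intermediate list.
    out[0] = [dtype(tok) for tok in first]
    i = 1
    for line in f:
        if line.strip():
            out[i] = [dtype(tok) for tok in line.split()]
            i += 1
    return out

# Example with an in-memory "file":
data = loadtxt_prealloc(StringIO("1 2\n3 4\n"))
```

The second pass still pays the cost of Python-level float parsing, so this mainly avoids the swap-thrashing from list growth rather than making parsing itself fast.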
> Best wishes,
To address the slowness, I use wrappers around savetxt/loadtxt that
save/load a .npy file along with (or instead of) the .txt file, and the
loadtxt wrapper checks whether the .npy is up to date.
Of course it's still slow the first time. I look forward to your speedups.
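The poster's actual wrappers aren't shown, but the caching idea can be sketched as follows: keep a binary `.npy` twin next to the text file and serve it when its modification time is at least as recent as the text file's. The names `savetxt_cached`/`loadtxt_cached` are hypothetical, not from the thread:

```python
import os
import numpy as np

def savetxt_cached(fname, arr, **kw):
    """Write the text file and a binary .npy twin alongside it."""
    np.savetxt(fname, arr, **kw)
    np.save(fname + ".npy", arr)

def loadtxt_cached(fname, **kw):
    """Load the .npy twin if it is up to date; otherwise parse the
    text file with np.loadtxt and refresh the cache."""
    cache = fname + ".npy"
    if (os.path.exists(cache)
            and os.path.getmtime(cache) >= os.path.getmtime(fname)):
        return np.load(cache)          # fast binary path
    arr = np.loadtxt(fname, **kw)      # slow text path, first time only
    np.save(cache, arr)
    return arr
```

Only the first load pays the text-parsing cost; subsequent loads hit `np.load`, which reads the raw binary directly into an array.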