[Numpy-discussion] loadtxt slow
Sun Mar 1 13:29:54 CST 2009
On Sun, 1 Mar 2009 16:12:14 -0500 Gideon Simpson wrote:
> So I have some data sets of about 160000 floating point numbers stored
> in text files. I find that loadtxt is rather slow. Is this to be
> expected? Would it be faster if it were loading binary data?
I have run into this as well. loadtxt uses a Python list to accumulate
the data it reads in, so once the list reaches about 1/4 of your
available memory, reallocating the grown list (which happens every
time it reads a new value from your data file) starts hitting swap
instead of main memory, which is ridiculously slow (in fact it makes
my system quite unresponsive, with a jumpy cursor). I have rewritten
loadtxt to be smarter about allocating memory, but it is slower overall
and doesn't support all of the original arguments/options (yet). I have
some ideas to make it smarter/more efficient, but have not had the time
to work on them recently.
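The idea can be sketched as follows: instead of appending each parsed value to a Python list, preallocate a NumPy array and grow it in fixed-size chunks, so reallocation happens once per chunk rather than once per value. This is only an illustrative sketch under that assumption; the helper name `loadtxt_chunked` and the chunk size are hypothetical, not the actual rewrite mentioned above.

```python
import numpy as np
from io import StringIO

def loadtxt_chunked(f, dtype=float, chunk_size=50000):
    """Read whitespace-separated numbers from a file-like object into a
    preallocated NumPy array, growing it one chunk at a time.
    (Hypothetical helper -- not NumPy's actual loadtxt implementation.)"""
    data = np.empty(chunk_size, dtype=dtype)
    n = 0
    for line in f:
        for tok in line.split():
            if n == len(data):
                # grow by a whole chunk, not per value
                data = np.concatenate([data, np.empty(chunk_size, dtype=dtype)])
            data[n] = dtype(tok)
            n += 1
    return data[:n]  # trim the unused tail

# usage with a small in-memory "file"
sample = StringIO("1.0 2.5\n3.0\n4.25 5\n")
arr = loadtxt_chunked(sample)
```

Growing by chunks keeps the number of reallocations at O(N/chunk_size) instead of relying on the amortized growth of a Python list holding boxed floats, which also roughly halves the peak memory footprint.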
I will send the current version to the list tomorrow, when I have
access to the system it is on.