[Numpy-discussion] fromfile() for reading text (one more time!)
Tue Jan 5 11:32:01 CST 2010
> On Mon, Jan 4, 2010 at 10:39 PM, <firstname.lastname@example.org> wrote:
>> I rather like the R command(s) for reading text files
> Aren't the newly improved
> and friends intended to handle all this?
Yes, they are, and they are great, but not really all that fast. If
you've got big complicated tables of data to read, then genfromtxt is
the way to go -- it's a great tool. However, for the simple stuff, it's
not really optimized. I also find I have to read a lot of text files
that aren't tables of data, but rather an odd mix of stuff, where most
of the work is still reading lots of numbers from a file. As far as I
can tell, genfromtxt and loadtxt can only load the entire file as a
table (a very common situation, of course).
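
To illustrate the "odd mix of stuff" case, here is a minimal sketch of reading a non-tabular file by alternating ordinary Python line reads with np.fromfile() on the same open file. The file layout (a count line followed by that many numbers) and the helper name read_blocks are assumptions made up for this example, not anything from the files discussed in this thread.

```python
import os
import tempfile

import numpy as np

# Hypothetical layout: each block is a count line, then that many numbers,
# which may be spread over one or several lines.
text = "3\n1.0 2.0 3.0\n2\n4.5\n5.5\n"
path = os.path.join(tempfile.mkdtemp(), "mixed.txt")
with open(path, "w") as f:
    f.write(text)


def read_blocks(path):
    """Return one float array per count-prefixed block in the file."""
    blocks = []
    # Binary mode keeps the Python-level file position exact, so line reads
    # and fromfile() calls can be interleaved on the same file object.
    with open(path, "rb") as f:
        while True:
            line = f.readline()
            if not line.strip():
                break  # end of file (or a blank line)
            count = int(line.decode().strip())
            # sep=" " selects text mode; any whitespace separates items.
            blocks.append(np.fromfile(f, dtype=np.float64, sep=" ", count=count))
    return blocks


blocks = read_blocks(path)
for block in blocks:
    print(block)
```

The count argument is what makes this work: fromfile() stops after the requested number of items, so the next readline() picks up at the following header.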
Paul Ivanov wrote:
> Just a potshot, but have you tried np.loadtxt?
> I find it pretty fast.
I guess I should have posted timings in the first place:
In : timeit timing.time_genfromtxt()
10 loops, best of 3: 216 ms per loop
In : timeit timing.time_loadtxt()
10 loops, best of 3: 166 ms per loop
In : timeit timing.time_fromfile()
10 loops, best of 3: 47.1 ms per loop
(40,000 doubles from a space-delimited text file)
so fromfile() is roughly 3.5 times as fast as loadtxt and roughly 4.6
times as fast as genfromtxt. That does make a difference for me -- a
user waiting 4 seconds rather than 1 second to load a file matters.
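
For anyone who wants to try a comparison like this, here is a rough sketch of such a benchmark. It is not the timing module used above: the file contents (4000 rows of 10 random doubles), the function names, and the repeat counts are all assumptions, and the absolute numbers will depend on the machine and NumPy version.

```python
import os
import tempfile
import timeit

import numpy as np

N_ROWS, N_COLS = 4000, 10  # 40,000 doubles in total (assumed shape)


def write_test_file(path):
    """Write random doubles as space-delimited text and return the array."""
    data = np.random.default_rng(0).random((N_ROWS, N_COLS))
    np.savetxt(path, data, delimiter=" ")
    return data


def read_fromfile(path):
    # Text-mode fromfile returns a flat array, so the shape must be restored.
    return np.fromfile(path, dtype=np.float64, sep=" ").reshape(-1, N_COLS)


def read_loadtxt(path):
    return np.loadtxt(path)


def read_genfromtxt(path):
    return np.genfromtxt(path)


path = os.path.join(tempfile.mkdtemp(), "doubles.txt")
expected = write_test_file(path)

results = {}
for reader in (read_genfromtxt, read_loadtxt, read_fromfile):
    # Best of 3 repeats, 3 calls each, reported as seconds per call.
    best = min(timeit.repeat(lambda: reader(path), number=3, repeat=3)) / 3
    results[reader.__name__] = best
    print(f"{reader.__name__}: {best * 1e3:.1f} ms per call")
```

Note that fromfile() knows nothing about rows or columns, which is part of why it can be fast; the caller has to supply the shape.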
I suppose another option might be to see if I can optimize the inner
scanning function of genfromtxt with Cython or C, but I'm not sure
that's possible, as it's really very flexible, and re-writing all of
that without Python would be really painful!
Christopher Barker, Ph.D.
Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception