[Numpy-discussion] How to read data from text files fast?

Chris Barker Chris.Barker at noaa.gov
Fri Jul 9 09:44:12 CDT 2004


Thanks for your feedback.

Bruce Southey wrote:
> While I am not really following your thread, I just wanted to comment that the
> Python Cookbook (at least the printed version) has some ways to count lines in a
> file - assuming that the number of lines provides the size.

The number of lines does not necessarily provide the size. In the 
general case, it doesn't at all. My whole goal here is the general case: 
being able to read a bunch of numbers out of a text file in any format. 
This can be used as part of a parser for many file formats. If I were 
shooting for just one format, this would be easier, but it would not be 
general purpose. Now that I have this, I can write a number of file 
format parsers in Python with improved performance and easier syntax.
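
To make the idea concrete, here is a rough pure-Python sketch of "read a bunch of numbers out of any format of text file". It is only an illustration, not the code discussed in this thread: the name scan_numbers, the count parameter, and the regular expression are all assumptions made for the example, and it returns a standard-library array rather than a Numeric array.

```python
import io
import re
from array import array

# Matches integers, decimals, and exponent notation, e.g. 3, -1.5, .5, 2e-3.
# Assumption for this sketch: anything else in the file is a separator.
_NUMBER = re.compile(r"[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?")


def scan_numbers(stream, count=None):
    """Hypothetical helper: collect up to `count` numbers from a text
    stream, skipping whatever non-numeric text lies between them."""
    values = array("d")
    for line in stream:
        for match in _NUMBER.finditer(line):
            values.append(float(match.group()))
            if count is not None and len(values) == count:
                # Note: the rest of the current line is discarded here.
                return values
    return values


sample = io.StringIO("x = 1.5, 2; 3e2\n# comment\n-4 .5\n")
print(scan_numbers(sample).tolist())
```

A format-specific parser would read its header lines itself and then call something like this for the numeric block. Note that the sketch also picks up digits embedded in words (for example the 1 in "x1"), which a real scanner would have to handle more carefully.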

> Under Unix (but not
> windows),

I am aiming for a portable solution.

> Alternatively if sufficient memory is available, storing the file in memory
> (during the counting of elements) should always be faster than reading it a
> second time from the hard disk.

The primary reason to scan the file ahead of time to count the elements 
is to avoid the memory cost of holding duplicate copies of the data. The 
other reason is to make memory management easier, but since I've already 
solved that problem, I'm done.
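
For comparison, the count-first approach might look roughly like the sketch below. This is an assumption-laden illustration, not what I have implemented: count_numbers and read_two_pass are made-up names, the stream must be seekable, and a standard-library array stands in for a Numeric array.

```python
import io
import re
from array import array

# Same assumed number pattern as a simple scanner would use.
_NUMBER = re.compile(r"[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?")


def count_numbers(stream):
    """First pass: count the numbers without storing any of them."""
    return sum(len(_NUMBER.findall(line)) for line in stream)


def read_two_pass(stream):
    """Hypothetical two-pass reader: count, allocate once, then fill."""
    n = count_numbers(stream)
    stream.seek(0)                      # requires a seekable stream
    values = array("d", [0.0]) * n      # exact-size buffer, allocated once
    i = 0
    for line in stream:
        for match in _NUMBER.finditer(line):
            values[i] = float(match.group())
            i += 1
    return values


print(read_two_pass(io.StringIO("1 2 3\n4.5\n")).tolist())
```

The trade-off is that the text is parsed twice in exchange for never holding a growing temporary buffer alongside the final array.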


Christopher Barker, Ph.D.
NOAA/OR&R/HAZMAT         (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov
