[SciPy-User] IO of large ASCII table data

Éric Depagne edepagne@lcogt....
Tue Aug 17 12:53:07 CDT 2010


On Tuesday 17 August 2010 10:41:26, Dan Lussier wrote:
> I am looking to read in large (many million rows) ASCII space
> separated tables into numpy arrays.
> 
> In the past I have heard of people using Miller's TableIO to do this
> but was wondering if a similarly fast method has been more recently
> integrated into scipy/numpy?
> 
> In consulting the documentation the most likely candidate is
> numpy.genfromtxt(...).  Is this function pure python or does it rely
> on a C extension as was the case with Miller's TableIO?
> 
> Any advice here would be great as my application could get seriously
> bogged down (both time and memory) in reading these files into arrays
> if I get onto the wrong track.
> 
> Thanks.
The numpy.loadtxt() function can also read data from a file.
I use it to read large datasets. As for speed, here are the numbers I
typically get: reading 2.5 million lines of 10 columns takes about 3 minutes.
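
For reference, a minimal sketch of the kind of numpy.loadtxt() call described above. The helper name read_table and the two-row sample data are made up for illustration; in practice you would pass the path of your own file instead of the in-memory buffer.

```python
import io

import numpy as np


def read_table(source):
    """Read a whitespace-separated numeric table into a 2-D float array.

    `source` may be a file name or an open file-like object.
    read_table is a hypothetical helper, not part of numpy.
    """
    # With no delimiter given, loadtxt splits on any whitespace.
    # ndmin=2 keeps the result two-dimensional even for a single row.
    return np.loadtxt(source, dtype=np.float64, ndmin=2)


# Tiny stand-in for a large space-separated file (illustrative data only).
sample = io.StringIO("1.0 2.0 3.0\n4.0 5.0 6.0\n")
table = read_table(sample)
print(table.shape)  # (2, 3)
```

If only some columns are needed, loadtxt's usecols argument (for example usecols=(0, 2)) avoids storing the rest in the resulting array.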

Éric.

-- 
An AZERTY keyboard is worth two
----------------------------------------------------------
Éric Depagne                            edepagne@lcogt.net
Las Cumbres Observatory 
6740 Cortona Dr
Goleta CA, 93117
----------------------------------------------------------

