[SciPy-user] efficiently importing ascii data

Scott Ransom sransom at nrao.edu
Sun Nov 13 11:05:34 CST 2005


Hmmm.  Your results with TableIO seem very strange.  I use it
all the time and it works like a charm.

For a single column of values, you should probably use
something like the following:

myarr = TableIO.readColumns("myfile.txt", "#")[0]

That will give you the column of numbers as a single 1-D
array. If you omit the [0] at the end, you will get a one-element
list whose only element is your array; if there are more
columns, each list element is another column.
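
To make that return shape concrete, here is a rough sketch of a
reader with the same "list of columns" behaviour. It is not TableIO's
code: read_columns is a hypothetical helper written with NumPy, and it
assumes every data row has the same number of fields and that commas
can be treated like whitespace.

```python
import numpy as np


def read_columns(path, comment="#"):
    """Hypothetical helper (not TableIO): return a list of 1-D arrays,
    one per column of a whitespace- or comma-separated text file."""
    rows = []
    with open(path) as fh:
        for line in fh:
            # Drop everything after the comment character, if one is given.
            if comment:
                line = line.split(comment, 1)[0]
            # Assumption: commas are plain separators, same as whitespace.
            fields = line.replace(",", " ").split()
            if not fields:
                continue  # blank or comment-only line
            rows.append([float(tok) for tok in fields])
    if not rows:
        return []
    # Assumption: all rows have the same length, so this is a 2-D array.
    data = np.array(rows, dtype=float)
    # Slice out each column as its own 1-D array.
    return [data[:, i] for i in range(data.shape[1])]
```

With a single-column file, read_columns("myfile.txt")[0] is the 1-D
array and read_columns("myfile.txt") is the one-element list around it.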

Scott


On Sun, Nov 13, 2005 at 10:59:36AM -0500, Darren Dale wrote:
> Hi Alan,
> 
> On Saturday 12 November 2005 5:37 pm, Alan G Isaac wrote:
> > On Thu, 10 Nov 2005, Darren Dale apparently wrote:
> > > I'm reading arrays of data from an ascii file and
> > > converting to appropriate numerical types. The data files
> > > can get pretty big, and I was wondering if someone here might
> > > have a suggestion on how to speed things up.
> >
> > This is a common request on the SciPy Users list.
> 
> Now that you mention it, I think I may have asked once before.
> 
> > I asked Mike Miller to consider releasing TableIO
> > http://php.iupui.edu/~mmiller3/python/
> > under a more Pythonic license so that SciPy could use it.
> > He initially sounded willing, but he never actually sent
> > a message releasing the code under another license.  Nor did
> > he say that he was ultimately unwilling to do so.
> > If you can work with GPL'd code, you might try TableIO.
> > I'll bcc: him on this to see if he has decided.
> 
> Thank you for the suggestion. I looked at TableIO, but I haven't been able to
> get it working properly. I tried to read a file that had '1e7,' repeated a
> million times, and it gave me an array that looked like
> 
> array([[ 10000000.],
>        [        0.],
>        [ 10000000.],
>        [        0.],
>        [ 10000000.],
>        [        0.],
>        [ 10000000.],
>        [        0.],
>        [ 10000000.],
>        [        0.],
>        [ 10000000.],
>        [        0.],
>        [ 10000000.],
>        [        0.],
>        [ 10000000.],
>        [        0.],
>        [ 10000000.],
>        [        0.],
>        [ 10000000.],
>        [        0.]])
> 
> I am considering using scipy's fromfile function, which gives a big speed 
> boost over io.read_array, but I don't understand what this docstring is 
> trying to tell me:
> 
>     WARNING: This function should be used sparingly, as it is not
>     a robust method of persistence.  But it can be useful to
>     read in simply-formatted or binary data quickly.
> 

-- 
Scott M. Ransom            Address:  NRAO
Phone:  (434) 296-0320               520 Edgemont Rd.
email:  sransom at nrao.edu             Charlottesville, VA 22903 USA
GPG Fingerprint: 06A9 9553 78BE 16DB 407B  FFCA 9BFA B6FF FFD3 2989


