[Numpy-discussion] odd ascii format and genfromtxt
Fri Feb 26 03:10:08 CST 2010
On Fri, Feb 26, 2010 at 4:29 PM, Warren Weckesser wrote:
> Ralf Gommers wrote:
> > Hi all,
> > I'm trying to read in data from text files with genfromtxt, and have
> > some trouble figuring out the right combination of keywords. The
> > format is:
> > ['0\t\t4.000000000000000e+007,0.000000000000000e+000\n',
> > '\t9.860280631554179e-001,-1.902586503306264e-002\n',
> > '\t9.860280631554179e-001,-1.902586503306264e-002']
> > Note that there are two delimiters, tab and comma. Also, the first
> > line has an extra integer plus tab (this is a repeating pattern).
> The 'delimiter' keyword does not accept a list of strings. If it is a
> list, it must be a list of integers that are the field widths. In your
> case, that won't work.
> You could try fromregex:
> In : import numpy as np
> In : cat sample.raw
> 0 4.000e+007,0.00000e+000
> 123 5.0e6,100.0
> In : a = np.fromregex('sample.raw', '(.*?)\t+(.*),(.*)',
> np.dtype([('extra', 'S8'), ('x', float), ('y', float)]))
> In : a
> array([('0', 40000000.0, 0.0), ('', 0.98602805999999998, -0.019025),
> ('', 0.98602805999999998, -0.019025), ('123', 5000000.0, 100.0),
> ('', 10.1, -0.002), ('', 10.199999999999999,
> dtype=[('extra', '|S8'), ('x', '<f8'), ('y', '<f8')])
> Note that the first field of the array is a string, not an integer. The
> string will be empty in rows that did not have the initial integer. I
> don't know if that will work for you.
That works, thanks. I had hoped that genfromtxt could do it because it can
skip the header and is presumably faster. But I'll take what I can get.
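
For anyone who still wants genfromtxt (for example, for its header
skipping), one possible workaround is to normalize the delimiters first.
The sketch below is not from the thread: it assumes each run of tabs can be
collapsed into a single comma, and it uses an invented header line and the
sample rows quoted above held in an in-memory string. Rows without the
leading integer then get nan in the first column.

```python
import io
import re

import numpy as np

# Sample rows from the original post, preceded by a made-up header line
# (the header is an assumption, added only to show skip_header).
raw = (
    "hypothetical header line\n"
    "0\t\t4.000000000000000e+007,0.000000000000000e+000\n"
    "\t9.860280631554179e-001,-1.902586503306264e-002\n"
    "\t9.860280631554179e-001,-1.902586503306264e-002"
)

# Collapse every run of tabs into one comma so that each data row has
# exactly three comma-separated fields (the first may be empty).
cleaned = "\n".join(re.sub(r"\t+", ",", line) for line in raw.splitlines())

# With a single delimiter, genfromtxt can parse the rows; empty fields
# are treated as missing and become nan for float columns.
data = np.genfromtxt(io.StringIO(cleaned), delimiter=",", skip_header=1)

print(data.shape)
print(data)
```

The preprocessing pass costs an extra trip through the text, so whether
this is faster than fromregex on large files is untested here.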