[Numpy-discussion] loadtxt stop
Zachary Pincus
zachary.pincus@yale....
Fri Sep 17 14:34:44 CDT 2010
>> In the end, the question was: is it worth adding start= and stop=
>> markers into loadtxt to allow grabbing sections of a file between
>> two known headers? I imagine it's something that people come up
>> against regularly.
Simple enough to wrap your file in a new file-like object that stops
coughing up lines when the delimiter is found, no?

class TruncatingFile(object):
    def __init__(self, fh, delimiter='END'):
        self.fh = fh
        self.delimiter = delimiter
        self.done = False

    def readline(self):
        # report end-of-file once the delimiter line has been seen
        if self.done:
            return ''
        line = self.fh.readline()
        if line.strip() == self.delimiter:
            self.done = True
            return ''
        return line

    def __iter__(self):
        return self

    def next(self):
        # check the flag here too, so iteration stays stopped after the delimiter
        if self.done:
            raise StopIteration()
        line = self.fh.next()
        if line.strip() == self.delimiter:
            self.done = True
            raise StopIteration()
        return line

import numpy
from StringIO import StringIO
c = StringIO("0 1\n2 3\nEND")
numpy.loadtxt(TruncatingFile(c))

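The class above is written for Python 2 (the StringIO module and a next method). As a rough sketch only, the same idea on Python 3 might look like the following, where the iterator method is spelled __next__ and StringIO comes from io. Routing __next__ through readline is my own simplification, not part of the original, so that both access paths share the done flag.

```python
from io import StringIO


class TruncatingFile:
    """Wrap a text file object and report end-of-file at a marker line."""

    def __init__(self, fh, delimiter="END"):
        self.fh = fh
        self.delimiter = delimiter
        self.done = False

    def readline(self):
        # Return '' (the usual end-of-file signal) once the marker was seen.
        if self.done:
            return ""
        line = self.fh.readline()
        if line.strip() == self.delimiter:
            self.done = True
            return ""
        return line

    def __iter__(self):
        return self

    def __next__(self):
        # Reuse readline so iteration and readline agree on where to stop.
        line = self.readline()
        if not line:
            raise StopIteration
        return line


wrapped = TruncatingFile(StringIO("0 1\n2 3\nEND\n4 5\n"))
lines = list(wrapped)  # only the lines before the END marker
```

An instance like this should be usable wherever a line iterator is expected, including as the first argument to numpy.loadtxt on NumPy versions that accept arbitrary iterables of lines.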
Though, really, it's annoying that numpy.loadtxt needs both the
readline function *and* the iterator protocol. If it just used
iterators, you could do:

def truncator(fh, delimiter='END'):
    for line in fh:
        if line.strip() == delimiter:
            break
        yield line

numpy.loadtxt(truncator(c))

Maybe I'll try to work up a patch for this.
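For the start= and stop= case in the original question, the generator idea extends naturally. What follows is only a sketch, assuming Python 3 and a NumPy whose loadtxt accepts a generator of lines (current releases document this). The helper name between_markers and the sample text are made up for illustration; neither is a loadtxt feature.

```python
from io import StringIO

import numpy as np


def between_markers(lines, start=None, stop="END"):
    """Yield the lines after a `start` line (if given) up to the next `stop` line."""
    started = start is None
    for line in lines:
        stripped = line.strip()
        if not started:
            # Skip everything up to and including the start marker.
            started = stripped == start
            continue
        if stripped == stop:
            break
        yield line


# Hypothetical file contents, loosely modelled on the example in this thread.
text = (
    "header line\n"
    "1 2.0 3.0\n"
    "2 4.5 5.7\n"
    "END\n"
    "more headers\n"
    "1 2.0 3.0 3.14\n"
    "2 4.5 5.7 1.14\n"
    "END\n"
)

fh = StringIO(text)
# Each call consumes the file up to and including one END line,
# so the second call picks up where the first one stopped.
first = np.loadtxt(between_markers(fh, start="header line"))
second = np.loadtxt(between_markers(fh, start="more headers"))
```

Because the generator reads lazily from the same file object, consecutive blocks with different column counts can be loaded one after another without rereading the file.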
Zach
On Sep 17, 2010, at 2:51 PM, Christopher Barker wrote:
> Neil Hodgson wrote:
>> In the end, the question was: is it worth adding start= and stop=
>> markers into loadtxt to allow grabbing sections of a file between
>> two known headers? I imagine it's something that people come up
>> against regularly.
>
> maybe not so regular. However, a common use would be to be able to load
> only n rows, which also does not appear to be supported. That would
> be nice.
>
> -Chris
>
>
>
>> Thanks,
>> Neil
>>
>> ------------------------------------------------------------------------
>> *From:* Neil Hodgson <hodgson.neil@yahoo.co.uk>
>> *To:* numpy-discussion@scipy.org
>> *Sent:* Fri, 17 September, 2010 14:17:12
>> *Subject:* loadtxt stop
>>
>> Hi,
>>
>> I've been looking around and couldn't spot anything on this. Quite often I
>> want to read a homogeneous block of data from within a file. The
>> skiprows option is great for missing out the section before the data
>> starts, but if there is anything below then loadtxt will choke. I
>> wondered if there was a possibility to put an endmarker= ?
>>
>> For example, if I want to load text from a large! file that looks
>> like this
>>
>> header line
>> header line
>> 1 2.0 3.0
>> 2 4.5 5.7
>> ...
>> 500 4.3 5.4
>> END
>> more headers
>> more headers
>> 1 2.0 3.0 3.14 1.1414
>> 2 4.5 5.7 1.14 3.1459
>> ...
>> 500 4.3 5.4 0.000 0.001
>> END
>>
>> Then I can use skiprows=2, but loadtxt will choke when it gets to
>> 'END'. To read t
>>
>>
>>
>> ------------------------------------------------------------------------
>>
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion@scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
> --
> Christopher Barker, Ph.D.
> Oceanographer
>
> Emergency Response Division
> NOAA/NOS/OR&R (206) 526-6959 voice
> 7600 Sand Point Way NE (206) 526-6329 fax
> Seattle, WA 98115 (206) 526-6317 main reception
>
> Chris.Barker@noaa.gov
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
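
On loading only n rows: one way to get that without changing loadtxt is to slice the line iterator before handing it over. A sketch, again assuming Python 3 and a loadtxt that accepts any iterable of lines. The sample text is invented for illustration. Later NumPy releases (1.16 and up, as far as I know) also added a max_rows argument to loadtxt, shown here for comparison.

```python
from io import StringIO
from itertools import islice

import numpy as np

# Hypothetical file contents: two header lines, three data rows, a marker.
text = (
    "header line\n"
    "header line\n"
    "1 2.0 3.0\n"
    "2 4.5 5.7\n"
    "3 4.3 5.4\n"
    "END\n"
)

# Skip the 2 header lines, then pass along only the next 2 lines.
rows = np.loadtxt(islice(StringIO(text), 2, 2 + 2))

# The same selection using loadtxt's own arguments (assumes NumPy >= 1.16).
rows_builtin = np.loadtxt(StringIO(text), skiprows=2, max_rows=2)
```

Either form stops reading before the END line is reached, so the marker never gets parsed as data.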