[Numpy-discussion] loadtxt stop

Zachary Pincus zachary.pincus@yale....
Fri Sep 17 14:34:44 CDT 2010


>> In the end, the question was: is it worth adding start= and stop=
>> markers into loadtxt to allow grabbing sections of a file between two
>> known headers?  I imagine it's something that people come up against
>> regularly.


Simple enough to wrap your file in a new file-like object that stops  
coughing up lines when the delimiter is found, no?

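import numpy
from io import StringIO  # on Python 2: from StringIO import StringIO

class TruncatingFile(object):
    """File-like wrapper that reports end-of-file at the delimiter line."""
    def __init__(self, fh, delimiter='END'):
        self.fh = fh
        self.delimiter = delimiter
        self.done = False
    def readline(self):
        if self.done:
            return ''
        line = self.fh.readline()
        # An empty string means the underlying file is exhausted.
        if not line or line.strip() == self.delimiter:
            self.done = True
            return ''
        return line
    def __iter__(self):
        return self
    def next(self):
        # Go through readline() so both protocols share the 'done' flag.
        line = self.readline()
        if not line:
            raise StopIteration()
        return line
    __next__ = next  # Python 3 name for the iterator method

c = StringIO("0 1\n2 3\nEND")
numpy.loadtxt(TruncatingFile(c))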

Though, really, it's annoying that numpy.loadtxt needs both the  
readline function *and* the iterator protocol. If it just used  
iterators, you could do:

def truncator(fh, delimiter='END'):
   for line in fh:
     if line.strip() == delimiter:
       break
     yield line

numpy.loadtxt(truncator(c))

Maybe I'll try to work up a patch for this.
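
For anyone reading this later, here is a minimal sketch of the generator
idea extended with a start marker, assuming a NumPy version whose loadtxt
documentation lists generators as accepted input. The name `section` and
its `start`/`stop` parameters are hypothetical (they are not part of
NumPy), and the sample data is made up to mirror the file layout in
Neil's message. The last line uses itertools.islice to take only the
first n rows, which is one way to get what Chris asks for.

```python
from io import StringIO
from itertools import islice

import numpy


def section(lines, start=None, stop='END'):
    """Yield the lines after a `start` marker line, up to a `stop` marker line.

    Both markers are compared against the stripped line; start=None means
    "begin at the first line". Neither marker line is yielded.
    """
    started = start is None
    for line in lines:
        text = line.strip()
        if not started:
            # Skip everything up to and including the start marker.
            started = (text == start)
            continue
        if text == stop:
            break
        yield line


# Made-up sample with two blocks of different widths, each closed by END.
text = StringIO(
    "header line\n"
    "1 2.0 3.0\n"
    "2 4.5 5.7\n"
    "END\n"
    "more headers\n"
    "1 2.0 3.0 3.14\n"
    "2 4.5 5.7 1.14\n"
    "END\n"
)

# Each call resumes where the previous one stopped reading.
first = numpy.loadtxt(section(text, start='header line'))
second = numpy.loadtxt(section(text, start='more headers'))

# Only the first n rows (here n = 2) of an iterable of lines.
head = numpy.loadtxt(islice(StringIO("0 1\n2 3\n4 5\n"), 2))
```

Because the generator stops at the first stop marker it meets, the two
blocks can have different numbers of columns without loadtxt ever seeing
a mixed-width table.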

Zach



On Sep 17, 2010, at 2:51 PM, Christopher Barker wrote:

> Neil Hodgson wrote:
>> In the end, the question was: is it worth adding start= and stop=
>> markers into loadtxt to allow grabbing sections of a file between two
>> known headers?  I imagine it's something that people come up against
>> regularly.
>
> Maybe not so regularly. However, a common use would be to be able to
> load only n rows, which also does not appear to be supported. That
> would be nice.
>
> -Chris
>
>
>
>> Thanks,
>> Neil
>>
>> ------------------------------------------------------------------------
>> *From:* Neil Hodgson <hodgson.neil@yahoo.co.uk>
>> *To:* numpy-discussion@scipy.org
>> *Sent:* Fri, 17 September, 2010 14:17:12
>> *Subject:* loadtxt stop
>>
>> Hi,
>>
>> I've been looking around and couldn't spot anything on this.  Quite often I
>> want to read a homogeneous block of data from within a file.  The
>> skiprows option is great for missing out the section before the data
>> starts, but if there is anything below then loadtxt will choke.  I
>> wondered if there was a possibility to put an endmarker= ?
>>
>> For example, if I want to load text from a large! file that looks like
>> this:
>>
>> header line
>> header line
>> 1 2.0 3.0
>> 2 4.5 5.7
>> ...
>> 500 4.3 5.4
>> END
>> more headers
>> more headers
>> 1 2.0 3.0 3.14 1.1414
>> 2 4.5 5.7 1.14 3.1459
>> ...
>> 500 4.3 5.4 0.000 0.001
>> END
>>
>> Then I can use skiprows=2, but loadtxt will choke when it gets to
>> 'END'.  To read t
>>
>>
>>
>> ------------------------------------------------------------------------
>>
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion@scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
> -- 
> Christopher Barker, Ph.D.
> Oceanographer
>
> Emergency Response Division
> NOAA/NOS/OR&R            (206) 526-6959   voice
> 7600 Sand Point Way NE   (206) 526-6329   fax
> Seattle, WA  98115       (206) 526-6317   main reception
>
> Chris.Barker@noaa.gov


