[Numpy-discussion] loadtxt stop
Benjamin Root
ben.root@ou....
Fri Sep 17 14:59:11 CDT 2010
On Fri, Sep 17, 2010 at 2:50 PM, Zachary Pincus <zachary.pincus@yale.edu> wrote:
> > Though, really, it's annoying that numpy.loadtxt needs both the
> > readline function *and* the iterator protocol. If it just used
> > iterators, you could do:
> >
> > def truncator(fh, delimiter='END'):
> >     for line in fh:
> >         if line.strip() == delimiter:
> >             break
> >         yield line
> >
> > numpy.loadtxt(truncator(c))
> >
> > Maybe I'll try to work up a patch for this.
>
>
> That seemed easy... worth applying? Won't break compatibility, because
> the previous loadtxt required both fname.readline and fname.__iter__,
> while this requires only the latter.
>
>
> Index: numpy/lib/npyio.py
> ===================================================================
> --- numpy/lib/npyio.py (revision 8716)
> +++ numpy/lib/npyio.py (working copy)
> @@ -597,10 +597,11 @@
>              fh = bz2.BZ2File(fname)
>          else:
>              fh = open(fname, 'U')
> -    elif hasattr(fname, 'readline'):
> -        fh = fname
>      else:
> -        raise ValueError('fname must be a string or file handle')
> +        try:
> +            fh = iter(fname)
> +        except:
> +            raise ValueError('fname must be a string or file handle')
>      X = []
>
>      def flatten_dtype(dt):
> @@ -633,14 +634,18 @@
>
>      # Skip the first `skiprows` lines
>      for i in xrange(skiprows):
> -        fh.readline()
> +        try:
> +            fh.next()
> +        except StopIteration:
> +            raise IOError('End-of-file reached before encountering data.')
>
>      # Read until we find a line with some values, and use
>      # it to estimate the number of columns, N.
>      first_vals = None
>      while not first_vals:
> -        first_line = fh.readline()
> -        if not first_line: # EOF reached
> +        try:
> +            first_line = fh.next()
> +        except StopIteration:
>              raise IOError('End-of-file reached before encountering data.')
>          first_vals = split_line(first_line)
>      N = len(usecols or first_vals)
>
>
So, this code will still raise an error for an empty file. Personally, I
consider that a bug, because I would expect to receive an empty array. I
could understand raising an error for a non-empty file that contains no
usable data. For comparison, Matlab returns an empty matrix when loading
an empty text file.

This has been a long-standing annoyance for me, along with loadtxt's
behavior on a single-line data file.
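
To make the behavior I am describing concrete, here is a rough sketch
layered on top of loadtxt rather than patched into it. It assumes a loadtxt
that accepts any iterable of lines, which is what the patch above proposes.
The name loadtxt_or_empty is made up for illustration and is not part of
numpy, and returning a zero-length 1-D array for empty input is only one
possible choice.

```python
import numpy as np


def truncator(lines, delimiter='END'):
    """Yield lines until one equals `delimiter` after stripping whitespace."""
    for line in lines:
        if line.strip() == delimiter:
            break
        yield line


def loadtxt_or_empty(lines, **kwargs):
    """Hypothetical wrapper (not a numpy function).

    Returns an empty array when the input has no non-blank lines,
    and otherwise defers to numpy.loadtxt.
    Assumes numpy.loadtxt accepts a list of strings.
    """
    # Materialize the iterable so emptiness can be checked up front.
    data_lines = [ln for ln in lines if ln.strip()]
    if not data_lines:
        return np.array([])
    return np.loadtxt(data_lines, **kwargs)
```

With something like this, loadtxt_or_empty(truncator(fh)) would read up to
the END marker, and an empty or all-blank input would give back an empty
array instead of an IOError. Input that has lines but no usable data is
still passed through to loadtxt, so whatever it does in that case is
unchanged.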
Ben Root
