[NumPy-Tickets] [NumPy] #2184: Remove
NumPy Trac
numpy-tickets@scipy....
Wed Jul 11 16:51:06 CDT 2012
#2184: Remove
-----------------------+----------------------------------------------------
Reporter: khaeru | Owner: somebody
Type: defect | Status: new
Priority: normal | Milestone: Unscheduled
Component: numpy.lib | Version: 1.6.1
Keywords: |
-----------------------+----------------------------------------------------
The documentation for `genfromtxt()` reads:
When the variables are named (either by a flexible dtype or with
''names'', there must not be any header in the file (else a ValueError
exception is raised).
and also:
If ''names'' is True, the field names are read from the first valid line
after the first ''skip_header'' lines.
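These two statements appear to conflict: the first forbids any header when
names are given, while the second says the names are read from a header line.
The cause of this seems to be in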
[https://github.com/numpy/numpy/blob/master/numpy/lib/npyio.py#L1347
numpy/lib/npyio.py at lines 1347-9]:
{{{
if names is True:
    if comments in first_line:
        first_line = asbytes('').join(first_line.split(comments)[1:])
}}}
'''The last line should read `first_line =
first_line.split(comments)[0]`.'''
With the current code, the input line:
{{{
# Example comment line
}}}
will be transformed to:
{{{
Example comment line
}}}
resulting in columns named 'Example', 'comment' and 'line' (this is what
the warning in the documentation is about).
But also the input line:
{{{
ColumnA ColumnB ColumnC # the column names precede this comment
}}}
will be transformed to:
{{{
the column names precede this comment
}}}
resulting in columns named 'the', 'column', 'names', etc. In this instance
the actual column names present in the file are inappropriately discarded.
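
Both transformations can be reproduced without NumPy. The sketch below is only a stand-in for the quoted snippet: `current_header_strip` is a hypothetical helper name, a plain `b""` replaces `asbytes('')`, and whitespace splitting is assumed in place of the real line splitter.

```python
def current_header_strip(first_line, comments=b"#"):
    """Mimic the quoted snippet: keep only what FOLLOWS the comment marker."""
    if comments in first_line:
        # Drops the part before the first marker and joins the rest.
        first_line = b"".join(first_line.split(comments)[1:])
    return first_line


# A pure comment line is turned into column names.
print(current_header_strip(b"# Example comment line").split())
# [b'Example', b'comment', b'line']

# Real column names before a trailing comment are thrown away.
header = b"ColumnA ColumnB ColumnC # the column names precede this comment"
print(current_header_strip(header).split())
# [b'the', b'column', b'names', b'precede', b'this', b'comment']
```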
By taking the `[0]` portion of the split instead of `[1:]`:
* Lines beginning with comments result in an empty string being passed to
`split_lines()` on L1350, producing no usable output and causing the
`while not first_values` loop to try the next line.
* Partial-line comments following actual heading names are discarded,
instead of the names themselves.
* As a result, files can have commented headers of any length ''and''
column names, simultaneously.
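
A minimal sketch of the behaviour described in the three points above, assuming whitespace-delimited names. `find_names` is a hypothetical helper, not NumPy code; its loop stands in for the `while not first_values` loop and `line.split()` for the real line splitter.

```python
def find_names(lines, comments=b"#"):
    """Return names from the first line that still has content after
    dropping everything from the comment marker onward (the `[0]` part)."""
    for line in lines:
        if comments in line:
            # Proposed change: keep the text BEFORE the comment marker.
            line = line.split(comments)[0]
        values = line.split()
        if values:
            return values
        # An empty result means a comment-only line: try the next one.
    return []


lines = [
    b"# Example comment line",
    b"# a second commented header line",
    b"ColumnA ColumnB ColumnC # the column names precede this comment",
    b"1 2 3",
]
print(find_names(lines))
# [b'ColumnA', b'ColumnB', b'ColumnC']
```

Here the two commented lines are skipped and the names are taken from the third line, with its trailing comment discarded.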
--
Ticket URL: <http://projects.scipy.org/numpy/ticket/2184>
NumPy <http://projects.scipy.org/numpy>