[NumPy-Tickets] [NumPy] #1473: genfromtxt issue with EOL and/or unicode
NumPy Trac
numpy-tickets@scipy....
Tue May 4 08:38:14 CDT 2010
#1473: genfromtxt issue with EOL and/or unicode
--------------------------------------+-------------------------------------
Reporter: vincentdavis | Owner: somebody
Type: defect | Status: new
Priority: normal | Milestone: Unscheduled
Component: numpy.lib | Version: 1.4.0
Keywords: genfromtxt, unicode, EOL |
--------------------------------------+-------------------------------------
Basic problem: a file saved as a CSV from Excel on a Mac cannot be opened
with genfromtxt.
This works, but should not be necessary:
f = file('x.csv', 'U')
genfromtxt(f, ...)
Stéfan van der Walt has fixed the same issue in loadtxt:
http://projects.scipy.org/numpy/changeset/8375
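A minimal sketch of why the 'U' workaround above plausibly helps, written for Python 3 (where file() and the 'U' mode are not available) and using only the standard library. The helper names and sample rows are illustrative assumptions, not part of the ticket. Binary mode stands in for reading without newline translation, and newline=None stands in for universal-newline mode:

```python
import os
import tempfile


def write_cr_sample():
    """Write a small CSV whose lines end in a bare '\r' (hypothetical sample data)."""
    fd, path = tempfile.mkstemp(suffix=".csv")
    with os.fdopen(fd, "wb") as f:
        f.write(b"date,gpd,temp,precip\r"
                b"1/1/00,8021472,52,0.02\r"
                b"1/2/00,9496016,46,0.06\r")
    return path


def lines_without_translation(path):
    """Iterate in binary mode: lines are split on '\n' only."""
    with open(path, "rb") as f:
        return list(f)


def lines_with_universal_newlines(path):
    """Iterate in text mode with newline=None: '\r', '\r\n' and '\n' all end a line."""
    with open(path, "r", newline=None) as f:
        return list(f)
```

Without translation the whole file arrives as a single line, so skipping one header line would leave nothing to parse. That would be consistent with the "End-of-file reached before encountering data" error, although the ticket itself does not confirm the cause.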
Long Story:
I ran into this issue and it was discussed on the pystatsmodels mailing
list.
Here is the setup:
Running on Mac OS X 10.6
Using Office 2008
Saving a spreadsheet from Excel with "Save As" as a CSV file.
Importing it with genfromtxt fails and reports an EOL error.
I thought this was because the EOL was wrong. It seems the file has '\r'
as the line ending (this may be wrong); anyway, I changed it to '\n' and it
works fine. I am told (on the pystatsmodels mailing list) that this is
actually because the file is in unicode and that genfromtxt does not read
the EOL correctly.
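One way to check which line ending a file actually uses is to count the terminators in its raw bytes. This is only a sketch using the standard library; the function name count_line_endings is made up for illustration:

```python
def count_line_endings(data):
    """Count CRLF, bare CR, and bare LF terminators in a bytes object."""
    crlf = data.count(b"\r\n")
    cr = data.count(b"\r") - crlf   # '\r' not followed by '\n'
    lf = data.count(b"\n") - crlf   # '\n' not preceded by '\r'
    return {"CRLF": crlf, "CR": cr, "LF": lf}


# Possible usage on the file from this ticket (the file must exist locally):
# with open("data_with_CR.csv", "rb") as f:
#     print(count_line_endings(f.read()))
```

A file that reports only "CR" terminators uses the bare '\r' line ending described above.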
To me it is a bug, because one might expect a user to want to save a file
from Excel and read it using genfromtxt. And for users with little
experience the problem is not obvious.
I guess this is not a problem with py3?
ORIGINAL ATTEMPT
datatype = [('date','|S9'),('gpd','i8'),('temp','i8'),('precip','f16')]
data = np.genfromtxt('waterdata.csv', delimiter=',', skip_header=1,
dtype=datatype)
Traceback (most recent call last):
  File "/Applications/WingIDE.app/Contents/MacOS/src/debug/tserver/_sandbox.py", line 1, in <module>
# Used internally for debug sandbox under external interpreter
  File "/Library/Frameworks/EPD64.framework/Versions/6.1/lib/python2.6/site-packages/numpy/lib/io.py", line 1048, in genfromtxt
raise IOError('End-of-file reached before encountering data.')
IOError: End-of-file reached before encountering data.
THIS DOES NOT WORK
>>> s = file('data_with_CR.csv','r')
>>> data = np.genfromtxt(s, delimiter=",", skip_header=1, dtype=None)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
  File "/Library/Frameworks/EPD64.framework/Versions/6.1/lib/python2.6/site-packages/numpy/lib/io.py", line 1048, in genfromtxt
raise IOError('End-of-file reached before encountering data.')
IOError: End-of-file reached before encountering data.
>>> data = np.genfromtxt(s, delimiter=",", , dtype=None)
File "<stdin>", line 1
data = np.genfromtxt(s, delimiter=",", , dtype=None)
THIS DOES WORK
>>> s = file('data_with_CR.csv','U')
>>> data = np.genfromtxt(s, delimiter=",", skip_header=1, dtype=None)
>>> data
array([('1/1/00', 8021472, 52, 0.02),
('1/2/00', 9496016, 46, 0.059999999999999998),
('1/3/00', 8478792, 29, 0.0), ..., ('12/29/02', 10790000, 61, 0.0),
('12/30/02', 9501000, 44, 0.0), ('12/31/02', 9288000, 53, 0.0)],
dtype=[('f0', '|S8'), ('f1', '<i8'), ('f2', '<i8'), ('f3', '<f8')])
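As an alternative to opening the file in 'U' mode, the line endings can be rewritten to '\n' before the file is handed to a reader, which is the manual fix described earlier in this ticket. The sketch below uses only the standard library; normalize_newlines and convert_file are hypothetical helper names, not NumPy functions:

```python
def normalize_newlines(data):
    """Return a copy of the bytes with CRLF and bare CR replaced by LF."""
    # Replace CRLF first so each pair becomes one '\n' rather than two.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


def convert_file(src_path, dst_path):
    """Write an LF-terminated copy of src_path to dst_path."""
    with open(src_path, "rb") as src:
        data = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(normalize_newlines(data))
```

The converted copy can then be passed to genfromtxt by filename in the usual way.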
--
Ticket URL: <http://projects.scipy.org/numpy/ticket/1473>
NumPy <http://projects.scipy.org/numpy>