[NumPy-Tickets] [NumPy] #1473: genfromtxt issue with EOL and/or unicode

NumPy Trac numpy-tickets@scipy....
Tue May 4 08:38:14 CDT 2010


#1473: genfromtxt issue with EOL and/or unicode
--------------------------------------+-------------------------------------
 Reporter:  vincentdavis              |       Owner:  somebody   
     Type:  defect                    |      Status:  new        
 Priority:  normal                    |   Milestone:  Unscheduled
Component:  numpy.lib                 |     Version:  1.4.0      
 Keywords:  genfromtxt, unicode, EOL  |  
--------------------------------------+-------------------------------------
 Basic problem: a file saved on a Mac using Excel as a CSV cannot be opened
 with genfromtxt.
 This will work, but it is not expected to be necessary:
 f = file('x.csv', 'U')
 genfromtxt(f, ...)

 Stéfan van der Walt has fixed the same issue in loadtxt:
 http://projects.scipy.org/numpy/changeset/8375

 Long Story:
 I ran into this issue and it was discussed on the pystatsmodels mailing
 list.
 Here is the setup:
 Running on Mac OS X 10.6
 Using Office 2008
 Saving a spreadsheet using Excel's "Save As" to a CSV file.

 Trying to import the file using genfromtxt fails and reports an EOL error.
 I thought this was because the EOL was wrong. It seems the file has '\r'
 as the line ending (this may be wrong); anyway, I changed it to '\n' and it
 works fine. I am told (on the pystatsmodels mailing list) that this is
 actually because the file is in unicode and that genfromtxt does not read
 the EOL correctly.
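
 One way to confirm which line ending a file really uses is to read it in
 binary mode and count the terminators. The following is a minimal sketch in
 Python 3, not part of the original report; the helper name
 count_line_endings and the sample file contents are hypothetical, chosen to
 mimic a CSV with bare '\r' line endings.

```python
import os
import tempfile


def count_line_endings(path):
    """Return counts of CRLF, bare CR, and bare LF terminators in a file."""
    # Binary mode guarantees that no newline translation happens on read.
    with open(path, "rb") as fh:
        raw = fh.read()
    crlf = raw.count(b"\r\n")
    return {
        "CRLF": crlf,
        "CR": raw.count(b"\r") - crlf,  # CRs that are not part of a CRLF
        "LF": raw.count(b"\n") - crlf,  # LFs that are not part of a CRLF
    }


# Hypothetical sample data: three lines, each terminated by a bare CR.
fd, sample_path = tempfile.mkstemp(suffix=".csv")
with os.fdopen(fd, "wb") as fh:
    fh.write(b"date,gpd,temp,precip\r1/1/00,8021472,52,0.02\r1/2/00,9496016,46,0.06\r")

endings = count_line_endings(sample_path)
os.remove(sample_path)
print(endings)
```

 A file that reports only "CR" terminators looks like a single very long
 line to any reader that splits on '\n' alone.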

 To me it is a bug because one might expect a user to want to save a file
 from Excel and read it using genfromtxt. And for users with little
 experience the problem is not obvious.

 I guess this is not a problem with py3?

 ORIGINAL ATTEMPT

 datatype = [('date','|S9'),('gpd','i8'),('temp','i8'),('precip','f16')]
 data = np.genfromtxt('waterdata.csv', delimiter=',', skip_header=1,
 dtype=datatype)

 Traceback (most recent call last):
   File
 "/Applications/WingIDE.app/Contents/MacOS/src/debug/tserver/_sandbox.py",
 line 1, in <module>
     # Used internally for debug sandbox under external interpreter
   File "/Library/Frameworks/EPD64.framework/Versions/6.1/lib/python2.6
 /site-packages/numpy/lib/io.py", line 1048, in genfromtxt

     raise IOError('End-of-file reached before encountering data.')
 IOError: End-of-file reached before encountering data.


 THIS DOES NOT WORK

 >>> s = file('data_with_CR.csv','r')
 >>> data = np.genfromtxt(s, delimiter=",", skip_header=1, dtype=None)
 Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
   File "/Library/Frameworks/EPD64.framework/Versions/6.1/lib/python2.6
 /site-packages/numpy/lib/io.py", line 1048, in genfromtxt
     raise IOError('End-of-file reached before encountering data.')
 IOError: End-of-file reached before encountering data.
 >>> data = np.genfromtxt(s, delimiter=",", , dtype=None)
   File "<stdin>", line 1
     data = np.genfromtxt(s, delimiter=",", , dtype=None)


 THIS DOES WORK

 >>> s = file('data_with_CR.csv','U')
 >>> data = np.genfromtxt(s, delimiter=",", skip_header=1, dtype=None)
 >>> data
 array([('1/1/00', 8021472, 52, 0.02),
        ('1/2/00', 9496016, 46, 0.059999999999999998),
        ('1/3/00', 8478792, 29, 0.0), ..., ('12/29/02', 10790000, 61, 0.0),
        ('12/30/02', 9501000, 44, 0.0), ('12/31/02', 9288000, 53, 0.0)],
       dtype=[('f0', '|S8'), ('f1', '<i8'), ('f2', '<i8'), ('f3', '<f8')])
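
 The difference between the two cases above is newline translation. As a
 rough illustration only (a Python 3 sketch using the standard library, not
 the NumPy code path), the built-in open() with newline=None behaves like the
 'U' mode used above, while newline="\n" splits on '\n' only. The sample file
 contents are hypothetical.

```python
import os
import tempfile

# Hypothetical sample data: three lines, each terminated by a bare CR.
fd, sample_path = tempfile.mkstemp(suffix=".csv")
with os.fdopen(fd, "wb") as fh:
    fh.write(b"date,gpd,temp,precip\r1/1/00,8021472,52,0.02\r1/2/00,9496016,46,0.06\r")

# newline="\n": only '\n' ends a line, so the bare CRs are left inside one line.
with open(sample_path, "r", newline="\n", encoding="ascii") as fh:
    raw_lines = fh.readlines()

# newline=None: universal newlines; '\r', '\n', and '\r\n' all become '\n'.
with open(sample_path, "r", newline=None, encoding="ascii") as fh:
    universal_lines = fh.readlines()

os.remove(sample_path)

print(len(raw_lines))
print(len(universal_lines))
```

 With only one "line" visible, skipping a single header line leaves nothing
 to parse, which is consistent with the "End-of-file reached before
 encountering data" error reported above.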

-- 
Ticket URL: <http://projects.scipy.org/numpy/ticket/1473>
NumPy <http://projects.scipy.org/numpy>

