[NumPy-Tickets] [NumPy] #2124: genfromtxt and unicode strings

NumPy Trac numpy-tickets@scipy....
Tue May 1 19:58:10 CDT 2012


#2124: genfromtxt and unicode strings
-----------------------+----------------------------------------------------
 Reporter:  anntzer    |       Owner:  somebody   
     Type:  defect     |      Status:  new        
 Priority:  normal     |   Milestone:  Unscheduled
Component:  numpy.lib  |     Version:  1.6.1      
 Keywords:             |  
-----------------------+----------------------------------------------------
 For bytes fields (in Python 3 terms), genfromtxt(dtype=None) sizes each
 field to its largest number of chars (npyio.py line 1596), but it does not
 do the same for unicode fields, which come out with zero width ('<U0').
 See the example:

 {{{
 import io, numpy as np
 s = io.BytesIO()
 s.write(b"abc 1\ndef 2")
 s.seek(0)
 t = np.genfromtxt(s, dtype=None) # (or converters={0: bytes})
 print(t, t.dtype) # -> [(b'a', 1) (b'b', 2)] [('f0', '|S1'), ('f1', '<i8')]
 s.seek(0)
 t = np.genfromtxt(s, dtype=None, converters={0: lambda s: s.decode("utf-8")})
 print(t, t.dtype) # -> [('', 1) ('', 2)] [('f0', '<U0'), ('f1', '<i8')]
 }}}

 I tried changing npyio.py around line 1600 to add this, but it didn't
 work; from my limited understanding, the problem arises earlier, in the
 way StringConverter is defined(?).
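
 A possible workaround in the meantime, sketched under the assumption that
 the file has exactly two whitespace-separated columns (a string and an
 integer): parse the lines by hand, decode the first column, and size the
 unicode field to the longest decoded value. The helper name
 read_text_columns is hypothetical and not part of NumPy; this bypasses
 genfromtxt rather than fixing it.

```python
import io
import numpy as np


def read_text_columns(stream, encoding="utf-8"):
    """Hypothetical helper: read (string, int) rows from a binary stream
    into a structured array whose unicode field fits the longest value."""
    rows = []
    for line in stream:
        if not line.strip():
            continue  # skip blank lines
        name, value = line.decode(encoding).split()
        rows.append((name, int(value)))
    # Size the unicode field to the widest decoded string (at least 1).
    width = max((len(name) for name, _ in rows), default=1)
    dtype = np.dtype([("f0", f"U{width}"), ("f1", "i8")])
    return np.array(rows, dtype=dtype)


s = io.BytesIO(b"abc 1\ndef 2")
t = read_text_columns(s)
print(t, t.dtype)
```

 The field width is computed after decoding, so it counts characters
 rather than encoded bytes.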

-- 
Ticket URL: <http://projects.scipy.org/numpy/ticket/2124>
NumPy <http://projects.scipy.org/numpy>

