[Numpy-discussion] Python3, genfromtxt and unicode

Antony Lee antony.lee@berkeley....
Fri Apr 27 21:17:39 CDT 2012


With bytes fields, genfromtxt(dtype=None) sets the size of each field to
the largest number of chars found in it (npyio.py line 1596), but it doesn't
do the same for unicode fields, which end up with zero width.  See example
below.
I tried changing npyio.py around line 1600 to add that, but it didn't work;
from my limited understanding the problem comes earlier, in the way
StringConverter is defined(?).
Antony Lee

import io, numpy as np
s = io.BytesIO()
s.write(b"abc 1\ndef 2")
s.seek(0)
t = np.genfromtxt(s, dtype=None) # (or converters={0: bytes})
print(t, t.dtype) # -> [(b'abc', 1) (b'def', 2)] [('f0', '|S3'), ('f1', '<i8')]
s.seek(0)
t = np.genfromtxt(s, dtype=None,
                  converters={0: lambda s: s.decode("utf-8")})
print(t, t.dtype) # -> [('', 1) ('', 2)] [('f0', '<U0'), ('f1', '<i8')]
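
A possible workaround, sketched here and not a fix for genfromtxt itself: read
the field as bytes (the case that is sized correctly) and decode it afterwards
with np.char.decode, which picks a width large enough for every element.  The
array `raw` below is a hypothetical stand-in for the bytes column returned by
genfromtxt, and the sketch assumes the data is valid UTF-8.

```python
import numpy as np

# Hypothetical stand-in for a bytes field as returned by genfromtxt(dtype=None).
raw = np.array([b"abc", b"de"])  # dtype is '|S3', sized to the longest entry

# Decode after loading; the result is sized to the longest decoded string,
# so nothing is truncated to zero width.
text = np.char.decode(raw, "utf-8")
print(text, text.dtype) # -> ['abc' 'de'] <U3
```

For a structured array, the same decoding would have to be applied field by
field and the result stored in a new array with a unicode field of that width.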