[NumPy-Tickets] [NumPy] #1555: fromstring on Unicode objects behavior change on Py3
NumPy Trac
numpy-tickets@scipy....
Sun Jul 25 11:15:51 CDT 2010
#1555: fromstring on Unicode objects behavior change on Py3
---------------------------------------------------+------------------------
Reporter: pv | Owner: somebody
Type: defect | Status: new
Priority: low | Milestone: 1.5.0
Component: numpy.core | Version:
Keywords: Py3 fromstring behavior PyArg unicode |
---------------------------------------------------+------------------------
Python 2:
{{{
>>> np.fromstring("\xe4".decode('latin1'), dtype=np.uint8)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe4' in position 0: ordinal not in range(128)
}}}
Python 3:
{{{
>>> np.fromstring(b"\xe4".decode('latin1'), dtype=np.uint8)
array([195, 164], dtype=uint8)
}}}
The cause is a change in how the `s#` format of `PyArg_ParseTuple*` handles Unicode objects:
- Python 2: "Unicode objects pass back a pointer to the default encoded
string version of the object if such a conversion is possible."
http://docs.python.org/c-api/arg.html
- Python 3: "Unicode objects are converted to C strings using 'utf-8'
encoding." http://docs.python.org/py3k/c-api/arg.html
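
The two byte values in the Python 3 output can be reproduced without NumPy. The following is a minimal sketch in plain Python 3, assuming only the standard `str.encode` method; `encoded_bytes` is a hypothetical helper introduced for illustration and is not part of NumPy or the C API.

```python
# Illustration only: reproduces the byte values from the ticket in pure
# Python 3, without calling NumPy. encoded_bytes is a hypothetical helper.

def encoded_bytes(text, encoding):
    """Return the bytes of `text` under `encoding` as a list of ints."""
    return list(text.encode(encoding))

text = b"\xe4".decode("latin1")  # the single character U+00E4

# UTF-8 needs two bytes for U+00E4, matching the Python 3 result above.
print(encoded_bytes(text, "utf-8"))   # [195, 164]

# Latin-1 maps the same character to a single byte.
print(encoded_bytes(text, "latin1"))  # [228]

# ASCII (the Python 2 default encoding) cannot represent it at all,
# which corresponds to the UnicodeEncodeError seen on Python 2.
try:
    encoded_bytes(text, "ascii")
except UnicodeEncodeError as exc:
    print("ascii failed:", exc.reason)
```

Encoding the string explicitly before handing it over, for example `text.encode('latin1')`, would presumably sidestep the implicit conversion on both Python versions, since a bytes object is passed through unchanged.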
--
Ticket URL: <http://projects.scipy.org/numpy/ticket/1555>
NumPy <http://projects.scipy.org/numpy>