[NumPy-Tickets] [NumPy] #1555: fromstring on Unicode objects behavior change on Py3

NumPy Trac numpy-tickets@scipy....
Sun Jul 25 11:15:51 CDT 2010


#1555: fromstring on Unicode objects behavior change on Py3
---------------------------------------------------+------------------------
 Reporter:  pv                                     |       Owner:  somebody
     Type:  defect                                 |      Status:  new     
 Priority:  low                                    |   Milestone:  1.5.0   
Component:  numpy.core                             |     Version:          
 Keywords:  Py3 fromstring behavior PyArg unicode  |  
---------------------------------------------------+------------------------
 Python 2:
 {{{
 >>> np.fromstring("\xe4".decode('latin1'), dtype=np.uint8)
 Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
 UnicodeEncodeError: 'ascii' codec can't encode character u'\xe4' in position 0: ordinal not in range(128)
 }}}

 Python 3:
 {{{
 >>> np.fromstring(b"\xe4".decode('latin1'), dtype=np.uint8)
 array([195, 164], dtype=uint8)
 }}}

 The difference comes from a behavior change in the `s#` format of `PyArg_ParseTuple*`:

   - Python 2: "Unicode objects pass back a pointer to the default encoded
 string version of the object if such a conversion is possible."
 http://docs.python.org/c-api/arg.html
   - Python 3: "Unicode objects are converted to C strings using 'utf-8'
 encoding." http://docs.python.org/py3k/c-api/arg.html
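
 The two conversions quoted above can be reproduced in pure Python, without calling `np.fromstring`. This is a minimal sketch for illustration only; the variable names are made up here, and it assumes Python 2's default encoding is left at ASCII, as the traceback above indicates.

```python
# Illustration only: reproduce the byte sequences from the ticket
# without going through np.fromstring.
text = b"\xe4".decode("latin1")  # the one-character string from the report

# Python 3's s# format encodes str arguments as UTF-8 before C code sees them.
utf8_bytes = text.encode("utf-8")
# The single byte the caller started from.
latin1_bytes = text.encode("latin1")

print(list(utf8_bytes))    # [195, 164], matching the Python 3 result above
print(list(latin1_bytes))  # [228]

# Python 2 used the default codec (ASCII), which cannot encode the character.
try:
    text.encode("ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True
print(ascii_failed)        # True
```

 Passing an explicitly encoded bytes object, for example `text.encode('latin1')`, rather than a Unicode string would presumably avoid the implicit conversion on either Python version.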

-- 
Ticket URL: <http://projects.scipy.org/numpy/ticket/1555>
NumPy <http://projects.scipy.org/numpy>

