[Numpy-discussion] Coercing object arrays to string (or unicode) arrays
Michael Droettboom
mdroe@stsci....
Wed Sep 23 14:18:24 CDT 2009
As I'm looking into fixing a number of bugs in chararray, I'm running
into some surprising behavior. One of the things chararray needs to do
occasionally is build up an object array of string objects, and then
convert that back to a fixed-length string array. This length is
sometimes predetermined by a recarray data structure. Unfortunately,
I'm not getting what I would expect when coercing or assigning an object
array to a string array. Is this a bug, or am I just going about this
the wrong way? If it is a bug, I'm happy to look into it as part of my
"fixing chararray" task, but I just wanted to confirm that it is a bug
before proceeding.
In [14]: x = np.array(['abcdefgh', 'ijklmnop'], 'O')
# Without specifying the length, it seems to default to sizeof(int)... ???
In [15]: np.array(x, 'S')
Out[15]:
array(['abcd', 'ijkl'],
      dtype='|S4')
In [21]: np.array(x, np.string_)
Out[21]:
array(['abcd', 'ijkl'],
      dtype='|S4')
# Specifying a length gives strange results
In [16]: np.array(x, 'S8')
Out[16]:
array(['abcdijkl', 'mnop\xe0\x01\x85\x08'],
      dtype='|S8')
# This is what I expected to happen above, but the cast to a list seems
# like it should be unnecessary
In [17]: np.array(list(x))
Out[17]:
array(['abcdefgh', 'ijklmnop'],
      dtype='|S8')
# Assignment also seems broken
In [18]: y = np.empty(x.shape, dtype='S8')
In [19]: y[:] = x[:]
In [20]: y
Out[20]:
array(['abcdijkl', 'mnop\xc05\xf9\xb7'],
      dtype='|S8')
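
One way to sidestep the problem in the meantime is to go through a plain Python list and pass an explicit width, which is essentially the list(x) route shown earlier. The following is only a sketch under stated assumptions: to_fixed_width is a hypothetical helper, not part of NumPy or chararray, and it assumes the object array holds byte strings (written with b'' literals here so that it also runs on Python 3).

```python
import numpy as np

def to_fixed_width(obj_arr, width=None):
    """Hypothetical helper: convert an object array of byte strings
    to a fixed-width 'S' array by way of a Python list."""
    # Flatten to a plain list so NumPy sees real string objects
    # rather than the object array's internal buffer.
    items = obj_arr.ravel().tolist()
    if width is None:
        # No width given (e.g. no recarray field to match):
        # use the longest element.
        width = max((len(s) for s in items), default=1)
    width = max(width, 1)  # 'S0' is not a useful itemsize
    out = np.array(items, dtype='S%d' % width)
    return out.reshape(obj_arr.shape)

x = np.array([b'abcdefgh', b'ijklmnop'], dtype=object)
y = to_fixed_width(x)      # width inferred from the data
z = to_fixed_width(x, 4)   # predetermined width; longer values are truncated
```

Passing the width explicitly covers the case where the length is fixed by a recarray field, while leaving it out matches what np.array(list(x)) does on its own.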
Cheers,
Mike