[Numpy-discussion] Coercing object arrays to string (or unicode) arrays
Christopher Barker
Chris.Barker@noaa....
Thu Sep 24 12:02:39 CDT 2009
Michael Droettboom wrote:
> As I'm looking into fixing a number of bugs in chararray, I'm running
> into some surprising behavior.
> In [14]: x = np.array(['abcdefgh', 'ijklmnop'], 'O')
>
> # Without specifying the length, it seems to default to sizeof(int)... ???
> In [15]: np.array(x, 'S')
> Out[15]:
> array(['abcd', 'ijkl'],
> dtype='|S4')
This sure looks like a bug. I'm no expert, but I suspect that it's
the size of a pointer (are you on a 32-bit system? I am), which makes a
bit of sense, as object arrays store pointers to the Python objects.
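One quick way to test the pointer-size guess is to compare the item size of an object array with the platform's pointer size. This is only a sketch: it does not reproduce the truncation itself, it just checks that the two sizes agree on the machine where it is run. The variable names are made up for illustration.

```python
import ctypes
import numpy as np

# Size in bytes of a C pointer on this platform (4 on 32-bit, 8 on 64-bit)
pointer_size = ctypes.sizeof(ctypes.c_void_p)

# Size in bytes of one element of an object array, which holds a
# reference to a Python object rather than the object's data
object_itemsize = np.dtype(object).itemsize

print(pointer_size, object_itemsize)
```

If both numbers are 4 on a 32-bit build, that would be consistent with the '|S4' result above.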
The question is, what should the array constructor do? Perhaps the
equivalent of:
In [41]: np.array(x.tolist())
Out[41]:
array(['abcdefgh', 'ijklmnop'],
dtype='|S8')
which you could use as a workaround.
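A minimal sketch of that workaround, plus a variant that measures the longest element and asks for that width explicitly. It assumes Python 3, where these literals are unicode, so the result is a 'U' dtype rather than the 'S' dtype shown in the session above. The names via_list, width and explicit are illustrative only.

```python
import numpy as np

x = np.array(['abcdefgh', 'ijklmnop'], dtype=object)

# Route 1: go through a plain list, so NumPy sizes the dtype
# from the strings themselves instead of from the object array
via_list = np.array(x.tolist())

# Route 2: find the longest element and request that width explicitly
width = max(len(item) for item in x.ravel())
explicit = x.astype('U%d' % width)

print(via_list.dtype, explicit.dtype)
```

Either route avoids relying on whatever default length the constructor picks for an object array.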
Do you need to go through object arrays? Could you go straight to a
string array:
In [35]: np.array(['abcdefgh', 'ijklmnop'], np.string_)
Out[35]:
array(['abcdefgh', 'ijklmnop'],
dtype='|S8')
or just keep the strings in a list.
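For anyone trying the direct route on a current NumPy: the np.string_ spelling used above is an older alias that was removed in NumPy 2.0, and np.bytes_ is the name to use instead. A small sketch, assuming Python 3 and byte-string literals:

```python
import numpy as np

# Build the fixed-width byte-string array directly, without an
# intermediate object array; NumPy picks the width from the data
direct = np.array([b'abcdefgh', b'ijklmnop'], dtype=np.bytes_)

print(direct.dtype)
```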
Object arrays are weird; I think there are a lot of corner cases.
-Chris
--
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
Chris.Barker@noaa.gov