[Numpy-discussion] Empty strings not empty?
Matthew Brett
matthew.brett@gmail....
Wed Dec 30 13:00:50 CST 2009
Hi.
> It isn't empty:
>
> In [3]: array(['\x00']).dtype
> Out[3]: dtype('|S1')
>
> In [4]: array(['\x00']).tostring()
> Out[4]: '\x00'
>
> In [5]: array(['\x00'])[0]
> Out[5]: ''
No, but my problem was that an empty string is not empty either, so
you can't distinguish between an empty string and a string of all
zero bytes:
In [11]: np.array('') == '\x00\x00\x00'
Out[11]: array(True, dtype=bool)
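The same check can be sketched for Python 3 and a more recent NumPy, where byte strings need a `b` prefix and `tobytes()` is the newer name for `tostring()`. The explicit `dtype='S3'` and the variable names are illustrative assumptions, not part of the original session.

```python
import numpy as np

# Two 0-d byte-string arrays with the same fixed width of 3 bytes.
empty = np.array(b'', dtype='S3')
zeros = np.array(b'\x00\x00\x00', dtype='S3')

# The underlying buffers are identical: three zero bytes each.
same_memory = empty.tobytes() == zeros.tobytes()

# The comparison treats trailing zero bytes as padding.
compare_equal = bool(empty == zeros)

# Reading the element back strips the trailing zero bytes as well.
empty_item = empty.item()
zeros_item = zeros.item()

print(same_memory, compare_equal, empty_item, zeros_item)
```

With fixed-width storage, both values occupy the same bytes in memory, so the array has no way to record which one was originally assigned.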
> Looks like a printing problem to me, something in __repr__ for the string
> array. It seems that trailing zeros are trimmed off.
>
> In [11]: array(['a\x00\x00'])
> Out[11]:
> array(['a'],
> dtype='|S3')
>
> In [12]: array(['a\x00b'])
> Out[12]:
> array(['a\x00b'],
> dtype='|S3')
I don't think it's a printing problem. I think the trailing zeros are
stripped in string comparisons as well as in printing, even though
they are present in memory. That is, a.tostring() is right, and the
__repr__ and comparisons are - at least to me - confusing.
In [2]: a = np.array('a\x00\x00\x00')
In [3]: a
Out[3]:
array('a',
dtype='|S4')
In [5]: a == 'a'
Out[5]: array(True, dtype=bool)
In [7]: a == 'a\x00\x00\x00'
Out[7]: array(True, dtype=bool)
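As a rough sketch of the distinction drawn above, assuming Python 3 bytes literals and `tobytes()` in place of `tostring()`, the following contrasts what is stored with what comparisons and element access report. The explicit dtypes and variable names are illustrative, not taken from the original session.

```python
import numpy as np

a = np.array(b'a\x00\x00\x00', dtype='S4')

# The raw buffer keeps all four bytes, trailing zeros included.
raw = a.tobytes()

# Element access returns the value with trailing zero bytes stripped.
item = a.item()

# Comparisons ignore trailing zero bytes on either side.
eq_short = bool(a == b'a')
eq_padded = bool(a == b'a\x00\x00\x00')

# An embedded zero byte (not trailing) is preserved and does matter.
b = np.array(b'a\x00b', dtype='S3')
embedded_item = b.item()
embedded_eq = bool(b == b'a')

print(raw, item, eq_short, eq_padded, embedded_item, embedded_eq)
```

Only trailing zero bytes are treated as padding here; a zero byte followed by other data survives both element access and comparison.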
See you,
Matthew