[Numpy-discussion] Re: string matrices

Robert Kern robert.kern at gmail.com
Tue Apr 4 09:52:11 CDT 2006

Ryan Krauss wrote:
> I actually have a problem with the elements of a string matrix from
> astype('S#').  The shorter elements in my matrix have a bunch of terms
> like '1.0', because the matrix they started from was a float.  I need
> to keep the float type, but want to get rid of the '.0' when I
> convert the string output to latex.  I was going to check if
> element[-2:]=='.0' but ran into this problem:
> In [15]: temp[-2:]
> Out[15]: '\x00\x00'
> In [16]: temp.strip()
> Out[16]: '1.0\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
> I think I can get rid of the \x00's by calling str(element), but is
> this a feature or a bug? 

Probably both.  :-)  On the one hand, you want to be able to get a useful string
out of the array; the nulls are just padding, and the string that you put in was
'1.0'. However, suppose that the string you put in was '1.\x00'. Then you would
get the "wrong" string out.

However, the only real alternative is to also store an integer containing the
length of the string with each element. That probably interferes with some of
the uses of string arrays.
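
As a rough illustration of that trade-off, here is a sketch in plain Python
bytes rather than NumPy itself. pad_to_width and the 4-byte WIDTH are made-up
names for this example. Fixed-width null padding cannot tell '1.' from
'1.\x00', while carrying the length alongside each element can:

```python
WIDTH = 4  # hypothetical fixed element width, in bytes


def pad_to_width(value, width=WIDTH):
    """Right-pad a byte string with nulls, as a fixed-width slot would."""
    return value.ljust(width, b"\x00")


plain = b"1."
with_null = b"1.\x00"  # a string that really does end in a null

# Once padded, the two are byte-for-byte identical.
same_when_padded = pad_to_width(plain) == pad_to_width(with_null)

# Keeping (length, padded bytes) pairs makes the original recoverable.
stored = [(len(v), pad_to_width(v)) for v in (plain, with_null)]
recovered = [buf[:n] for n, buf in stored]

print(same_when_padded)
print(recovered)
```

The cost is the extra integer per element, which is the part that would
probably get in the way of treating the array as a plain block of characters.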

> It would be slightly cleaner for me if the
> string matrix elements didn't have the trailing null characters (or
> whatever those are), but this may not be possible given the underlying
> representation.

You can also use temp.strip('\x00'), which is a bit more explicit about what
is being removed.
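
For the LaTeX use case, something along these lines might do. It is only a
sketch on ordinary Python strings; clean_element is a hypothetical helper, not
anything NumPy provides, and it assumes the padding is always trailing nulls:

```python
def clean_element(raw, pad="\x00"):
    """Strip trailing pad characters, then drop a trailing '.0' if present."""
    text = raw.rstrip(pad)
    if text.endswith(".0"):
        text = text[:-2]
    return text


# Stand-in for an element pulled out of the string matrix.
padded = "1.0" + "\x00" * 16

# A bare strip() only removes whitespace, so the nulls survive it.
whitespace_stripped = padded.strip()

# Naming the pad character explicitly removes them.
null_stripped = padded.rstrip("\x00")

print(repr(null_stripped))
print(clean_element(padded))
print(clean_element("2.5" + "\x00" * 16))
```

rstrip is used rather than strip so that only the trailing padding is touched.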

Robert Kern
robert.kern at gmail.com

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco
