This seems to be an old problem, but I recently hit it in a seemingly
random way (I'm using numpy 1.6.1). There is a ticket (1239), but the
issue appears to be unscheduled. Can somebody tell me if this is
fixed?
In particular, it makes for very unstable behavior when you take an
element from a string array and pickle it across the wire. For
example:
In [1]: import numpy
In [2]: a = numpy.array(['a', '', 'b'])
In [3]: import cPickle
In [4]: s = cPickle.dumps(a[1])
In [5]: cPickle.loads(s)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
/auto/cnvtvws/wlee/fstrat/src/<ipython-input-5-555fae2bd4f5> in <module>()
----> 1 cPickle.loads(s)
TypeError: ('data type not understood', <type 'numpy.dtype'>, ('S0', 0, 1))
Note that pickling a[0] or a[2] works, so whether it works or fails
depends on whether the element happens to be an empty string. Checking for
this case in the code and working around it would really be a pain.
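For what it's worth, here is a rough sketch of the kind of workaround I mean, assuming it is acceptable to send plain Python objects instead of NumPy scalars. The helper name to_plain is hypothetical (not part of numpy), and the sketch uses the standard pickle module; on Python 2, cPickle could be swapped in.

```python
import pickle

import numpy


def to_plain(value):
    """Hypothetical helper: turn a NumPy scalar into a built-in Python object."""
    # numpy.generic is the base class of all NumPy scalar types;
    # .item() returns the equivalent built-in Python object.
    if isinstance(value, numpy.generic):
        return value.item()
    return value


a = numpy.array(['a', '', 'b'])

# Round-trip every element, including the empty string, through pickle.
restored = [pickle.loads(pickle.dumps(to_plain(x))) for x in a]
print(restored)
```

This sidesteps the zero-length string dtype entirely, at the cost of losing the NumPy scalar type on the receiving end, and it has to be applied at every place where an element gets pickled.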
On Thu, Sep 24, 2009 at 7:03 PM, David Warde-Farley <dwf@cs.toronto.edu> wrote:
> On 23-Sep-09, at 7:55 PM, David Warde-Farley wrote:
>
> > It seems that either dtype(str) should do something more sensible than
> > zero-length string, or it should be possible to create it with
> > dtype('|S0'). Which should it be?
>
>
> Since there wasn't any response I went ahead and fixed it by making
> str and unicode dtypes allow a size of 0 when constructed with
> protocol type codes. Either S0 and U0 should be constructable with
> typecodes or they shouldn't be allowed at all; I opted for the former
> since a) it was simple and b) I don't know what a sensible default for
> dtype(str) would be (length 1? length 10?).
>
> Patch is at:
>
> http://projects.scipy.org/numpy/ticket/1239
>
> Review away!
>
> David
