[Numpy-discussion] Automatic string length in recarray

David Warde-Farley dwf@cs.toronto....
Tue Nov 3 10:43:25 CST 2009


On 2-Nov-09, at 11:35 PM, Thomas Robitaille wrote:

> But if I want to specify the data types:
>
> np.rec.fromrecords([(1,'hello'),(2,'world')],dtype=[('a',np.int8),
> ('b',np.str)])
>
> the string field is set to a length of zero:
>
> rec.array([(1, ''), (2, '')], dtype=[('a', '|i1'), ('b', '|S0')])
>
> I need to specify datatypes for all numerical types since I care about
> int8/16/32, etc, but I would like to benefit from the auto string
> length detection that works if I don't specify datatypes. I tried
> replacing np.str by None but no luck. I know I can specify '|S5' for
> example, but I don't know in advance what the string length should be
> set to.

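This is a limitation of the way the dtype code works, and AFAIK
there's no easy fix. In some code I wrote recently I had to loop
through the entire list of records to find the longest string first,
e.g. max(len(foo[2]) for foo in records), where the string was the
third field of each record, and then use that length in the dtype.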
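
Applied to the example from the original post, that workaround might look roughly like the sketch below. The helper name fromrecords_auto_str, the convention of passing None for a string field, and the kind argument are all made up for illustration; none of them are part of NumPy. It also assumes plain ASCII strings, so that len() matches the byte length an 'S' field needs.

```python
import numpy as np

def fromrecords_auto_str(records, fields, kind="S"):
    """Build a record array, sizing string fields from the data.

    Hypothetical helper, not a NumPy function. `fields` is a list of
    (name, dtype) pairs; a dtype of None marks a string field whose
    length is measured from `records`.
    """
    dtype = []
    for i, (name, dt) in enumerate(fields):
        if dt is None:
            # One pass over all records to find the longest string in column i.
            width = max(len(rec[i]) for rec in records)
            # Avoid a zero-length field if every string is empty.
            dt = "%s%d" % (kind, max(width, 1))
        dtype.append((name, dt))
    return np.rec.fromrecords(records, dtype=dtype)

records = [(1, "hello"), (2, "world")]
rec = fromrecords_auto_str(records, [("a", np.int8), ("b", None)])
print(rec.dtype)
```

The numeric fields keep exactly the types you asked for, and only the fields marked None cost an extra pass over the data. Note that max() raises ValueError on an empty list of records, so that case would need handling separately.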

David

