[Numpy-discussion] some question on new dtype

N. Volbers mithrandir42 at web.de
Tue Jan 24 23:03:01 CST 2006


Hello everyone on the list!

I have been playing around with the latest and greatest numpy 0.9.4 and 
its dtype mechanism. I am especially interested in using the 
record-array-like facilities, e.g. the following which is based on an 
example from a mail of Travis to this list:

<--
import numpy

# define an array with three "columns" (fields)
dtype = numpy.dtype({'names': ['name', 'age', 'weight'],
                     'formats': ['U30', 'i2', numpy.float32]})
a = numpy.array([(u'Bill', 31, 260), (u'Fred', 15, 135)], dtype=dtype)

# specify column by key
print(a['name'])
print(a['age'])
print(a['weight'])
#print(a['object'])

# specify row by number
print(a[0])
print(a[1])

# 'age' field of row 1 (Fred's age)
print(a[1]['age'])

# first item of name column (name 'Bill')
print(a['name'][0])
-->

I now have a few questions, maybe someone can help me with them:

1) When reading the sample chapter from Travis' documentation, I noticed 
that there is also a type 'object' with the character 'O'. So I kind of 
hoped that it would be possible to have arbitrary python objects in an 
array. However, when I add a fourth "column" of type 'O', then numpy 
will mem-fault. Is this not allowed or is this some implementation bug?
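
A minimal sketch of the kind of four-column definition meant here. The field name 'extra' and the objects stored in it are assumptions made up for illustration, and whether this runs cleanly or crashes will depend on the numpy version in use:

```python
import numpy

# Same layout as the example above, plus a fourth field of type 'O'
# (arbitrary Python object). The name 'extra' and its values are
# hypothetical and only serve as an illustration.
dtype_with_object = numpy.dtype({'names': ['name', 'age', 'weight', 'extra'],
                                 'formats': ['U30', 'i2', numpy.float32, 'O']})
b = numpy.array([(u'Bill', 31, 260, {'hobby': 'chess'}),
                 (u'Fred', 15, 135, [1, 2, 3])], dtype=dtype_with_object)

# each row may hold a different kind of Python object in 'extra'
print(b['extra'])
```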

2) Is it possible to rename the type descriptors? For my application, I 
need to treat these names as keys of dataset columns, so it should be 
possible to rename these.  More generally speaking: Is it possible to 
alter parts of the dtype after instantiation? Of course it should be 
possible to copy the dtype, modify it accordingly and create a new 
array. However, maybe there is a recommended way of doing this?
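
The copy-and-modify approach could be sketched roughly as follows. This assumes the dtype exposes its field names as `dtype.names` and the per-field types through `dtype.fields`, which may not hold for every numpy version; `rename_fields` is a hypothetical helper, not something numpy is known to provide under this name here:

```python
import numpy

old_dtype = numpy.dtype({'names': ['name', 'age', 'weight'],
                         'formats': ['U30', 'i2', numpy.float32]})
a = numpy.array([(u'Bill', 31, 260), (u'Fred', 15, 135)], dtype=old_dtype)

def rename_fields(arr, mapping):
    """Return a copy of arr with field names replaced according to mapping.

    Hypothetical helper: assumes arr.dtype.names and arr.dtype.fields exist.
    """
    old_names = arr.dtype.names
    new_names = [mapping.get(n, n) for n in old_names]
    # dtype.fields maps each name to a (dtype, offset) tuple; keep the dtype
    formats = [arr.dtype.fields[n][0] for n in old_names]
    new_dtype = numpy.dtype({'names': new_names, 'formats': formats})
    out = numpy.empty(arr.shape, dtype=new_dtype)
    # copy the data column by column into the renamed fields
    for old, new in zip(old_names, new_names):
        out[new] = arr[old]
    return out

b = rename_fields(a, {'weight': 'mass'})
print(b.dtype.names)
```

The original array `a` is left untouched, so this costs one full copy of the data.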

3) When I use two identical entries in the names part of the dtype, I 
get the message 'TypeError: expected a readable buffer object'. It makes 
sense that it is not allowed to have two identical names, but I think 
the error message should be worded more descriptively.
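
Until the message changes, a small check before building the dtype gives a clearer error. This is only a sketch: `check_unique_names` is a hypothetical helper written in plain Python, not part of numpy:

```python
def check_unique_names(names):
    """Raise a descriptive ValueError if a field name occurs more than once.

    Hypothetical helper; returns the names as a list when they are unique.
    """
    seen = set()
    duplicates = []
    for name in names:
        if name in seen and name not in duplicates:
            duplicates.append(name)
        seen.add(name)
    if duplicates:
        raise ValueError("duplicate field name(s) in dtype: %s"
                         % ", ".join(repr(d) for d in duplicates))
    return list(names)

# passes through unchanged when all names are distinct
names = check_unique_names(['name', 'age', 'weight'])
```

The returned list can then be passed as the 'names' entry when constructing the dtype.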

4) In the example above, printing any of the strings via 'print' yields 
the actual characters followed by \x00 padding up to the declared string 
size, e.g.

  u'Bill\x00\x00\x00\x00\x00\x00\x00.... (30 characters total)'

Why doesn't 'print' terminate the output when the first \x00 is reached?
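
As a workaround in the meantime, the padding can be stripped by hand. A minimal sketch, in which the padded string is built manually to mimic the output above and `strip_padding` is a hypothetical helper:

```python
def strip_padding(value):
    """Remove trailing \\x00 padding from a fixed-width string value."""
    return value.rstrip(u'\x00')

# mimic a 'U30' field holding u'Bill': 4 characters plus 26 NUL characters
padded = u'Bill' + u'\x00' * 26
print(strip_padding(padded))
```

Note that this also removes any trailing \x00 characters that were genuinely part of the stored value.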



Overall I am very much impressed by the new numpy and I thank everyone 
who contributes to this!

Niklas Volbers.





