[Numpy-discussion] some question on new dtype

Travis Oliphant oliphant at ee.byu.edu
Wed Jan 25 13:37:03 CST 2006


N. Volbers wrote:

> Thanks for your quick answer!
>
>>> 1) When reading the sample chapter from Travis' documentation, I 
>>> noticed that there is also a type 'object' with the character 'O'. 
>>> So I kind of hoped that it would be possible to have arbitrary 
>>> python objects in an array. However, when I add a fourth "column" of 
>>> type 'O', then numpy will mem-fault. Is this not allowed or is this 
>>> some implementation bug?
>>
>>
>>
>> It's a bug if it seg-faults; that should be allowed.  Please post 
>> your code so we can track it down.
>
>
> Attachment 1 contains  a four-liner that reproduces the segfault. But 
> maybe there is something wrong on my system, because I also get the 
> following message when importing numpy: "import linalg -> failed: 
> /usr/lib/python2.4/site-packages/numpy/linalg/lapack_lite.so: 
> undefined symbol: s_wsfe"
>
> I removed numpy, checked out from SVN, re-installed and the error is 
> still there.
>
> Regarding my question about changing dtypes: Your sample code...
>
>> new = dict(a.dtype.fields) # get a writeable dictionary.
>> new['<newname>'] = new['<oldname>']
>> del new['<oldname>']
>> del new[-1]  # get rid of the special ordering entry
>> a.dtype = dtype(new)
>>
> ... is short and exactly what I asked for.
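
For readers trying the rename today, here is a minimal sketch. It assumes a recent NumPy release on Python 3, neither of which is part of the original thread. In such releases the `fields` mapping is read-only and carries no `-1` ordering entry, so the quoted snippet is unlikely to run unchanged. The helper `numpy.lib.recfunctions.rename_fields` is one way to get the same effect. The field names `oldname`, `newname`, and `value` are placeholders chosen for illustration.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

# Hypothetical two-field array; the field names are placeholders.
a = np.zeros(3, dtype=[("oldname", "i4"), ("value", "f8")])

# rename_fields returns an array whose dtype uses the new field name.
b = rfn.rename_fields(a, {"oldname": "newname"})

print(b.dtype.names)
```

Fields not listed in the mapping keep their names, so only the entries that need to change have to be spelled out.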
>
>>>
>>> 3) When I use two identical entries in the names part of the dtype, 
>>> I get the message 'TypeError: expected a readable buffer object'. It 
>>> makes sense that it is not allowed to have two identical names, but 
>>> I think the error message should be more descriptive.
>>
>>
>>
>> Yeah, we ought to check for this.  Could you post the code that 
>> raises this error?
>
>
> It seems you have already fixed the error message for the case in 
> which you have two identical names for dtype entries. Man, you are 
> fast! But... ;-) you forgot the situation described in my second code 
> attachment, which is the following situation:
>
> >>> dtype = numpy.dtype( [('name', 'S30'), ('name', 'i2')] )
> >>> a = numpy.zeros( (4,), dtype=dtype)
>
> This works, but leads to silly results. The last instance of 'name' 
> determines the field type, i.e. a[0] = ('Niklas', 1) is invalid, while 
> a[0] = (1,1) is valid!
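
A small check for the duplicate-name case is sketched below. It assumes a recent NumPy release, where, to my knowledge, building such a dtype fails with a `ValueError` instead of silently keeping the last field. The variable `rejected` is only there for illustration.

```python
import numpy as np

# Try to build a dtype that uses the field name 'name' twice.
try:
    np.dtype([("name", "S30"), ("name", "i2")])
    rejected = False
except ValueError as exc:
    # Recent releases refuse the duplicate field name up front.
    rejected = True
    print("rejected:", exc)
```

If `rejected` comes out `False` on a given installation, that installation still shows the behaviour described above.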
>
>>> 4) In the example above, printing any of the strings via 'print' 
>>> will yield the characters followed by \x00 padding up to the full 
>>> string size, e.g.
>>>
>>>  u'Bill\x00\x00\x00\x00\x00\x00\x00.... (30 characters total)'
>>>
>>> Why doesn't 'print' terminate the output when the first \x00 is 
>>> reached ?
>>
>>
>
> Thanks for your bug-fix. Unicode display works now.
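
The padding behaviour can be checked with a short sketch. It assumes a recent NumPy release on Python 3, and the single-field dtype is made up for the example. The element handed back to Python should have its trailing nulls stripped, while the underlying fixed-width buffer remains padded.

```python
import numpy as np

# One record with a 30-character unicode field (illustrative layout).
a = np.zeros(1, dtype=[("name", "U30")])
a[0] = ("Bill",)

value = a["name"][0]   # element as returned to Python
raw = a.tobytes()      # underlying fixed-width buffer

print(repr(value), len(value))
print(len(raw), raw.endswith(b"\x00"))
```

Comparing `len(value)` with `len(raw)` shows that the padding lives only in storage and not in the value that gets printed.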
>
>
> Best regards,
>
> Niklas Volbers.
>
>------------------------------------------------------------------------
>
>import numpy
>dtype = numpy.dtype( [('object', 'O')] )
>a = numpy.zeros( (4,), dtype=dtype)
>print a
>  
>
>  
>
I know what the problem is.  In too many places, I did not account for 
the fact that a VOID array could in fact have OBJECT sub-fields.  Thus, 
any special processing done on OBJECT arrays to handle reference counts 
must also be done on any OBJECT sub-fields of a VOID array.  
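
To make the reference-count point concrete, here is a sketch built on the four-line reproducer. It assumes a recent NumPy release on Python 3 in which object sub-fields are supported, and `payload` is a made-up stand-in for an arbitrary Python object. Storing an object in a record should leave the array holding its own reference to it.

```python
import sys
import numpy as np

# Same layout as the reproducer: one field holding Python objects.
dt = np.dtype([("object", "O")])
a = np.zeros((4,), dtype=dt)

payload = ["any", "python", "object"]
before = sys.getrefcount(payload)
a[0] = (payload,)                  # the array now refers to payload
after = sys.getrefcount(payload)

print(dt.hasobject)                # dtype reports an object sub-field
print(a["object"][0] is payload)   # the same object comes back out
print(after > before)              # the array holds a reference
```

If the count did not go up, the object could be freed while the array still pointed at it, which is the kind of failure a segfault would suggest.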

This may take a while to fix properly....  If anyone has any ideas, let 
me know. 

-Travis






