[Numpy-discussion] some question on new dtype
Travis Oliphant
oliphant at ee.byu.edu
Wed Jan 25 13:37:03 CST 2006
N. Volbers wrote:
> Thanks for your quick answer!
>
>>> 1) When reading the sample chapter from Travis' documentation, I
>>> noticed that there is also a type 'object' with the character 'O'.
>>> So I kind of hoped that it would be possible to have arbitrary
>>> python objects in an array. However, when I add a fourth "column" of
>>> type 'O', then numpy will mem-fault. Is this not allowed or is this
>>> some implementation bug?
>>
>>
>>
>> It's a bug if it seg-faults; that should be allowed. Please post
>> your code so we can track it down.
>
>
> Attachment 1 contains a four-liner that reproduces the segfault. But
> maybe there is something wrong on my system, because I also get the
> following message when importing numpy: "import linalg -> failed:
> /usr/lib/python2.4/site-packages/numpy/linalg/lapack_lite.so:
> undefined symbol: s_wsfe"
>
> I removed numpy, checked out from SVN, re-installed and the error is
> still there.
>
> Regarding my question about changing dtypes: Your sample code...
>
>> new = dict(a.dtype.fields) # get a writeable dictionary.
>> new['<newname>'] = new['<oldname>']
>> del new['<oldname>']
>> del new[-1] # get rid of the special ordering entry
>> a.dtype = dtype(new)
>>
> ... is short and exactly what I asked for.
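
For readers on much later NumPy releases, the same rename can be sketched with `numpy.lib.recfunctions.rename_fields`, which returns a view with the new field names. This is only an illustration and not part of the original exchange; the field names `oldname`, `newname` and `count` are placeholders.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

# Placeholder structured array; the field names are illustrative only.
a = np.zeros(4, dtype=[('oldname', 'S30'), ('count', 'i2')])
a[0] = (b'Niklas', 1)

# rename_fields takes a mapping {old name: new name} and returns a view
# of the same data whose dtype carries the renamed fields.
b = rfn.rename_fields(a, {'oldname': 'newname'})

print(b.dtype.names)
print(b['newname'][0])
```

Fields not mentioned in the mapping keep their names.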
>
>>>
>>> 3) When I use two identical entries in the names part of the dtype,
>>> I get the message 'TypeError: expected a readable buffer object'. It
>>> makes sense that it is not allowed to have two identical names, but
>>> I think the error message should be worded more descriptively.
>>
>>
>>
>> Yeah, we ought to check for this. Could you post your code that
>> raises this error?
>
>
> It seems you have already fixed the error message for the case in
> which you have two identical names for dtype entries. Man, you are
> fast! But... ;-) you forgot the situation described in my second code
> attachment, which is the following situation:
>
> >>> dtype = numpy.dtype( [('name', 'S30'), ('name', 'i2')] )
> >>> a = numpy.zeros( (4,), dtype=dtype)
>
> This works, but leads to silly results. The last instance of 'name'
> determines the field type, i.e. a[0] = ('Niklas', 1) is invalid, while
> a[0] = (1,1) is valid!
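
A check for repeated names, run before the dtype is built, would catch the case above. Here is a minimal sketch in plain Python; `find_duplicate_field_names` is a hypothetical helper written for illustration, not a numpy function, and it assumes the dtype is given as a list of (name, format) tuples as in the example.

```python
from collections import Counter

def find_duplicate_field_names(spec):
    """Return the field names that occur more than once in a dtype spec.

    `spec` is assumed to be a list of (name, format) tuples.
    """
    counts = Counter(entry[0] for entry in spec)
    return [name for name, n in counts.items() if n > 1]

spec = [('name', 'S30'), ('name', 'i2')]
duplicates = find_duplicate_field_names(spec)
if duplicates:
    print("duplicate field names: %s" % ", ".join(duplicates))
```

A caller could raise an error with this message instead of printing it, which is more descriptive than 'expected a readable buffer object'.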
>
>>> 4) In the example above, printing any of the strings via 'print'
>>> will yield the characters and then the characters up to the string
>>> size filled up with \x00, e.g.
>>>
>>> u'Bill\x00\x00\x00\x00\x00\x00\x00.... (30 characters total)'
>>>
>>> Why doesn't 'print' terminate the output when the first \x00 is
>>> reached?
>>
>>
>
> Thanks for your bug-fix. Unicode display works now.
>
>
> Best regards,
>
> Niklas Volbers.
>
>------------------------------------------------------------------------
>
>import numpy
>dtype = numpy.dtype( [('object', 'O')] )
>a = numpy.zeros( (4,), dtype=dtype)
>print a
>
>
>
>
I know what the problem is. In too many places, I did not account for
the fact that a VOID array could in fact have OBJECT sub-fields. Thus,
any special processing done on OBJECT arrays to handle reference counts
must also be done on any OBJECT sub-fields of a VOID array.
This may take a while to fix properly. If anyone has any ideas, let
me know.
-Travis
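
The kind of recursion this implies can be sketched at the Python level. The walk below visits a dtype's sub-fields and reports whether any of them holds Python objects. It is only an illustration of the idea, not the actual fix in the C code; `has_object_subfield` is a hypothetical helper, and the `fields`, `subdtype` and `kind` attributes are used as they exist in later NumPy releases.

```python
import numpy as np

def has_object_subfield(dt):
    """Return True if the dtype, or any nested sub-field, is of object type."""
    dt = np.dtype(dt)
    if dt.kind == 'O':
        return True
    if dt.fields is not None:
        # Each value in dt.fields is a tuple whose first item is the field dtype.
        return any(has_object_subfield(info[0]) for info in dt.fields.values())
    if dt.subdtype is not None:
        # Sub-array dtypes carry their element dtype in subdtype[0].
        return has_object_subfield(dt.subdtype[0])
    return False

print(has_object_subfield([('object', 'O')]))
print(has_object_subfield([('name', 'S30'), ('count', 'i2')]))
```

Any code that adjusts reference counts for object arrays would need a similar walk over the fields of a VOID array.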