FW: [Numpy-discussion] typecodes in numarray

Todd Miller jmiller at stsci.edu
Mon Jan 27 06:44:03 CST 2003


Francesc Alted wrote:

>Yeah. We just differ in the way to arrange this metadata to be passed to the
>recarray constructor. But I think this is secondary compared to the
>flexibility that a verbose approach offers compared with the actual string
>format. 
>
Yes.  So one question is:  if we were to add type-repetition tuples to 
recarray as an alternative to the current character code strings,  would 
that be any form of improvement to recarray from your perspective?

As I see it,  recarray currently has a clean separation between format 
and naming which permits the latter to be optional.  Before changing 
that,  I'd need a clear argument why.  (I didn't design and generally 
don't even maintain recarray).

>In fact, more than one container might be supported to define the
>metadata; one can start with tuples as you suggest, but in the future other
>ways can be added (if considered convenient).
>  
>
>For example, I think I'll stick with the dictionary option for PyTables, but
>also a class declaration for the metadata would be supported, like in :
>
>class Small(IsRecord):
>    var1 = defineType(CharType, 2, "")
>    var2 = defineType(Int32, 1)
>    var3 = Float64
>
>This would not be difficult to support because, by accessing
>Small().__dict__, you get also a dictionary. In addition, the latter will
>ensure (by construction) that you are not using a non-valid python
>identifier, which is mandatory in my current implementation. I find these
>containers (dictionaries and classes) both elegant and convenient.
>  
>
I'm not trying to be Mr. Negative here,  but one thing to keep in mind 
is this:

 >>> class C:
...     pass
...
 >>> c = C()
 >>> dir(c.__dict__)
['__class__', '__cmp__', '__contains__', '__delattr__', '__delitem__', 
'__doc__', '__eq__', '__ge__', '__getattribute__', '__getitem__', 
'__gt__', '__hash__', '__init__', '__iter__', '__le__', '__len__', 
'__lt__', '__ne__', '__new__', '__reduce__', '__repr__', '__setattr__', 
'__setitem__', '__str__', 'clear', 'copy', 'fromkeys', 'get', 'has_key', 
'items', 'iteritems', 'iterkeys', 'itervalues', 'keys', 'pop', 
'popitem', 'setdefault', 'update', 'values']

Which is to say,  the instance dictionary is a little cluttered,  and it 
might not be that easy to determine which objects in it are there to 
define the data format.
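
A minimal sketch of one way to pick out the format-defining entries, 
assuming a hypothetical convention that field names never start with an 
underscore.  Attributes assigned in a class body live in the class 
namespace (vars(Small)) rather than in the instance __dict__,  and that 
namespace also carries entries the interpreter adds,  such as __module__ 
and __doc__.  The names Small and field_definitions are illustrative 
only,  and plain strings stand in for the type objects:

```python
class Small:
    # Plain strings stand in for real type objects (an assumption).
    var1 = "CharType"
    var2 = "Int32"
    var3 = "Float64"


def field_definitions(cls):
    """Return the class attributes that look like field definitions.

    Hypothetical convention: anything whose name starts with an
    underscore is interpreter or implementation clutter, not a field.
    """
    return {
        name: value
        for name, value in vars(cls).items()
        if not name.startswith("_")
    }


fields = field_definitions(Small)
print(sorted(fields))
```

The filter only works as long as every field obeys the naming 
convention,  which is exactly the kind of extra rule the tuple form does 
not need.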

>>Just like the type repetition tuple except also including field names
>>and default values.   I don't think you lost me.  For what we do,  the
>>exact physical layout of the "struct" is important, so order matters.  I
>>see order as part of the meta-data,  but I don't usually deal with
>>meta-entities so maybe I've got that part wrong.  :)
>>
>
>Well, if you need positional fields, you may add an (optional) parameter,
>called for example, "position" so that you can fix it. 
>  
>
I'm sure that's not the easiest way to capture struct layout,  but I 
take your point.   Since position matters to me,  I'd prefer that 
capturing it was implicit.   Since it doesn't to you, it seems OK for 
it to be explicit.   Either default mode can support the other,  but 
capturing order with tuples is free,  while capturing order with a 
__dict__ will take some kind of extra work.
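
To illustrate the difference,  here is a small sketch using plain Python 
containers.  The field names,  the type names written as strings,  and 
the "position" key are all hypothetical and are not part of recarray or 
PyTables:

```python
# Tuple form: the order of the entries *is* the field order.
tuple_spec = (
    ("var1", "CharType", 2),
    ("var2", "Int32", 1),
    ("var3", "Float64", 1),
)
tuple_order = [name for name, _type, _repeat in tuple_spec]

# Mapping form: order has to be stated explicitly (here through a
# hypothetical "position" entry) and recovered by sorting.
dict_spec = {
    "var3": {"type": "Float64", "repeat": 1, "position": 2},
    "var1": {"type": "CharType", "repeat": 2, "position": 0},
    "var2": {"type": "Int32", "repeat": 1, "position": 1},
}
dict_order = sorted(dict_spec, key=lambda name: dict_spec[name]["position"])

print(tuple_order)
print(dict_order)
```

Both forms end up with the same layout,  but the mapping needs the extra 
key and the sort to get there.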

>>I was thinking that if the above was an issue,  we could write an API
>>function(s) to "compile" the type-repetition tuple into arrays of ints
>>which describe the type of each field and corresponding repetition factor.
>>    
>>
>
>Yeah, I agree that this would be the best solution. That way, the charcodes
>will be factored out from the code, and by just providing such an API (both
>in Python and C), would be enough to reconstruct them, if needed. That will
>allow a more consistent numarray internal code. 
>  
>
I'm thinking the general format for this may be converting N-tuples of 
types and ints into N arrays of types and ints.  And vice versa.
It's obvious how this works with numarray types.  I think the chararray 
types need work and need to be mapped into the same integer enumeration 
as the numeric types in a non-overlapping way.
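
A rough sketch of that round trip,  using plain Python lists in place of 
arrays.  The integer enumeration and the function names compile_format 
and decompile_format are made up for illustration;  nothing here is an 
existing numarray API:

```python
# Hypothetical enumeration: the character type shares one integer
# code space with the numeric types, with no overlapping values.
TYPE_CODES = {"Int32": 0, "Float64": 1, "CharType": 2}
TYPE_NAMES = {code: name for name, code in TYPE_CODES.items()}


def compile_format(spec):
    """Split a tuple of (type name, repetition) pairs into two
    parallel lists: integer type codes and repetition factors."""
    types = [TYPE_CODES[name] for name, _repeat in spec]
    repeats = [repeat for _name, repeat in spec]
    return types, repeats


def decompile_format(types, repeats):
    """Rebuild the tuple of (type name, repetition) pairs."""
    return tuple(
        (TYPE_NAMES[code], repeat) for code, repeat in zip(types, repeats)
    )


spec = (("CharType", 2), ("Int32", 1), ("Float64", 1))
types, repeats = compile_format(spec)
print(types, repeats)
print(decompile_format(types, repeats) == spec)
```

Because the two lists stay in field order,  the physical layout survives 
the conversion in both directions.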

>See you Monday,
>  
>
>
>Right, how did you know that? :)
>  
>
Insightful on weekends anyway, 
Todd





