FW: [Numpy-discussion] Proposed record array behavior: the rest of the story: updated
perry at stsci.edu
Wed Jul 28 15:02:04 CDT 2004
I guess I've seen enough discussion to try to refine the last delta into
what is the last (or next to last) version:
So here are the changes to the last updated proposal:
1) I originally intended to narrow attribute access to strictly legal names,
as Rick White suggested, but something got into me to try to handle spaces. I
agree with Rick on this: only names that are legal attribute names will be
accessible as attributes. I see that as a very simple rule to remember and
don't see it as confusing to allow this.
2) Attribute access still won't be permitted directly on record arrays or
records. I'm very much in agreement with Francesc that "fields" is more
suggestive than "field" as to the record and record array object that
permits both indexing and attribute access by name. The use of the field
method will remain, but will eventually be deprecated. As to other names,
namely cols, I'll stick with fields since it started with that usage, and
since field is a more appropriate term when dealing with multidimensional
record arrays (columns is much more suggestive of simple tables).
3) It will not be possible to index record arrays by column name. So
will not be permitted, but
will. Nor will
Rarr[32, "column 1"]
4) As for optional labels (for display purposes) I'd like to hold off. I
would like to have only one way to associate a name with a field and until
it is clearer what extra record array functionality would be associated with
labels, I'd rather not include them. Even then, I'm not sure I want to see
too much more dragged in (e.g., units, display formats, etc.) These sorts of
things may be more appropriate for a subclass.
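The access rules in items 1 to 3 above can be illustrated with a toy model.
This is only a sketch under stated assumptions, not the numarray
implementation: the Fields and RecordArray classes below are hypothetical
stand-ins, plain Python lists play the role of columns, and only the
accept/reject behavior described in the proposal is modeled.

```python
class Fields:
    """Hypothetical stand-in for the proposed ``fields`` attribute."""

    def __init__(self, names, columns):
        self._names = list(names)
        self._columns = dict(zip(self._names, columns))

    def __getitem__(self, key):
        # Indexing by field name or by field number is allowed here.
        if isinstance(key, str):
            return self._columns[key]
        return self._columns[self._names[key]]

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails. Names that are
        # not legal identifiers are never mapped to some other attribute.
        columns = self.__dict__.get("_columns", {})
        if name.isidentifier() and name in columns:
            return columns[name]
        raise AttributeError(name)


class RecordArray:
    """Hypothetical record array: rows by number, fields only via .fields."""

    def __init__(self, names, columns):
        self._names = list(names)
        self.fields = Fields(names, columns)

    def __getitem__(self, key):
        # Field names are rejected anywhere in the index (item 3).
        parts = key if isinstance(key, tuple) else (key,)
        if any(isinstance(p, str) for p in parts):
            raise IndexError("index by field name through .fields instead")
        return tuple(self.fields[name][key] for name in self._names)


rarr = RecordArray(["intensity", "home address"],
                   [[1.5, 2.5], ["a st", "b st"]])
by_attribute = rarr.fields.intensity       # legal attribute name
by_name = rarr.fields["home address"]      # reachable by indexing only
```

In this sketch a name such as "home address" needs no special handling: it
simply cannot be written as an attribute, and indexing on fields covers it.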
I realize that no single person will be happy with these choices, but they
seem to me to be the best compromise without unduly complicating things,
restricting future enhancements, or being too hard to implement.
Has anything fallen into a crack?
So what follows is an updated version of what I last sent out:
1) Russell Owen asked that indexing by field name not be permitted for
record arrays and at least one other agreed. Since it is easier to add
something like this later rather than take it away, I'll go along with that.
So while it will be possible to index a Record by field name, it won't be
for record arrays.
2) Russell asked if it would be possible to specify the types of the fields
using numarray/chararray type objects. Yes, it will. We will adopt Rick
White's 2nd suggestion for handling fields that themselves are arrays, i.e.,
formats = (3,Int16), ((4,5), Float32)
for a 1-d Int16 cell of shape (3,) and a 2-d Float32 cell of shape (4,5).
The first suggestion ("formats = 3*(Int16,), 4*(5*(Float32,),)") will not be
supported. While it is very suggestive, it does allow for inconsistent
nestings that must be checked and rejected (what if someone supplies
(Int16, Int16, Float32) as one of the fields?), which complicates the code.
It also doesn't read as well.
3) Russell also suggested nesting record arrays. This sort of capability is
not being ruled out, but there isn't a chance we can devote resources to
this any time soon (can anyone else?)
4) To address the suggestions of Russell and Francesc, I'm proposing that a
new attribute "fields" be added that allows:
a) indexing by name or number (just like Records)
b) names as attributes, so long as the name is a legal attribute name. No
attempt will be made to map names that are not legal attribute strings into
a different attribute name.
The field method will remain and be eventually deprecated.
Note that the only real need to support indexing other than consistency is
to support slices. Only slices for numerical indexing will be supported (and
not initially). The callable syntax can support index arrays just as easily.
Will all work for a field named "home address" but this field cannot be
specified as an attribute of Rarr.fields
If there is a field named "intensity" then
Will be permitted.
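The (shape, type) format notation adopted in item 2 can also be sketched in
a few lines. This is an illustration only, assuming that each formats entry
is either a bare type or a (shape, type) pair; the normalize_format helper
is hypothetical, and the strings "Int16" and "Float32" stand in for the
numarray type objects.

```python
def normalize_format(entry):
    """Return a (shape, type) pair for one entry of a formats tuple.

    Hypothetical helper. Accepted forms (an assumption of this sketch):
      type               -> ((), type)        scalar cell
      (n, type)          -> ((n,), type)      1-d cell
      ((n, m, ...), type) -> ((n, m, ...), type)
    """
    if not isinstance(entry, tuple):
        return (), entry
    if len(entry) != 2:
        raise ValueError("expected a type or a (shape, type) pair")
    shape, cell_type = entry
    if isinstance(shape, int):
        shape = (shape,)
    if not isinstance(shape, tuple) or not all(
        isinstance(n, int) and n > 0 for n in shape
    ):
        raise ValueError("shape must be a positive int or a tuple of them")
    return shape, cell_type


# Placeholder type names; the real type objects would be used in practice.
formats = (3, "Int16"), ((4, 5), "Float32")
cells = [normalize_format(f) for f in formats]
```

Because every entry is a flat pair, there is no nesting to check for
consistency; anything that is not a bare type or a two-element pair is
simply rejected.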