[Numpy-discussion] Adding the ability to "clone" a few fields from a data-type
Thu Oct 30 10:26:45 CDT 2008
On Thu, Oct 30, 2008 at 04:33, Francesc Alted <firstname.lastname@example.org> wrote:
> On Thursday 30 October 2008, Robert Kern wrote:
>> On Wed, Oct 29, 2008 at 19:05, Travis E. Oliphant
>> <email@example.com> wrote:
>> > Hi all,
>> > I'd like to add to NumPy the ability to clone a data-type object so
>> > that only a few fields are copied over but that it retains the
>> > same total size.
>> > This would allow, for example, the ability to "select out a few
>> > records" from a structured array using
>> > subarr = arr.view(cloned_dtype)
>> > Right now, it is hard to do this because you have to at least add a
>> > "dummy" field at the end. A simple method on the dtype class
>> > (fromfields or something) would be easy to add.
>> I'm not sure what this accomplishes. Would the dummy fields that fill
>> in the space be inaccessible? E.g. tuple(subarr[i,j,k]) gives a tuple
>> with no numpy.void scalars? That would be a novel feature, but I'm
>> not sure it fits the problem. On the contrary:
>> > It was thought in the past to do this with indexing
>> > arr['field1', 'field2']
>> > And that would still be possible (and mostly implemented) if this
>> > feature is added.
>> This appears more like the interface that people want. Except that I
>> think people were thinking that it would follow fancy indexing
>> arr[['field1', 'field2']]
> I've thought about that too. That would be a great thing to have, IMO.
>> I guess there are two ways to implement this. One is to make a new
>> array that just contains the desired fields. Another is to make a
>> view that just points to the desired fields in the original array
>> provided that we have a new feature for inaccessible dummy fields.
>> One point for the former approach is that it is closer to fancy
>> indexing which must always make a copy. The latter approach breaks
>> that connection.
> Yeah. I'd vote for avoiding the copy.
>> OTOH, now that I think about it, I don't think there is really any
>> coherent way to mix field selection with any other indexing
>> operations. At least, not within the same brackets. Hmm. So maybe the
>> link to fancy indexing can be ignored as, ahem, fanciful.
> Well, one can always check that fields in the fancy list are either
> strings (map to name fields) or integers (map to positional fields).
> However, I'm not sure if this check would be too expensive.
That's not my concern. The problem is that the field-indexing applies
to the entire array, not just an axis. So what would the following do?
a[['foo', 'bar'], [1,2,3]]
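To make the distinction concrete, here is a minimal sketch, assuming a NumPy version that accepts a list of field names as an index. The array and its field names ('foo', 'bar', 'baz') are made up for illustration. Field selection acts on every element regardless of shape, while the integer list picks along one axis, so the two are written as separate steps:

```python
import numpy as np

# Hypothetical 2-D structured array; the field names are illustrative only.
a = np.zeros((4, 5), dtype=[('foo', 'i4'), ('bar', 'f8'), ('baz', 'S3')])

# Field selection applies to the whole array: the shape is unchanged,
# only the set of visible fields shrinks.
sub = a[['foo', 'bar']]
print(sub.shape, sub.dtype.names)

# Fancy indexing applies to an axis: this picks rows 1, 2 and 3.
rows = sub[[1, 2, 3]]
print(rows.shape)
```

Keeping the two operations in separate brackets sidesteps the question of which axis, if any, the field list should belong to.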
>> Overall, I guess, I would present the feature slightly differently.
>> Provide a kind of inaccessible and invisible dtype for implementing
>> dummy fields. This is useful in other places like file parsing. At
>> the same time, implement a function that uses this capability to make
>> views with a subset of the fields of a structured array. I'm not sure
>> that people need an API for replacing the fields of a dtype like this.
> Mmh, not sure what you are proposing there. Do you mean something like:
> In : t = numpy.dtype([('f0','i4'),('f1', 'f8'), ('f2', 'S20')])
> In : nt = t.astype(['f2', 'f0'])
> In : ra = numpy.zeros(10, dtype=t)
> In : nra = ra.view(nt)
> In : ra
> array([(0, 0.0, ''), (0, 0.0, ''), (0, 0.0, ''), (0, 0.0, ''),
> (0, 0.0, ''), (0, 0.0, ''), (0, 0.0, ''), (0, 0.0, ''),
> (0, 0.0, ''), (0, 0.0, '')],
> dtype=[('f0', '<i4'), ('f1', '<f8'), ('f2', '|S20')])
> In : nra
> array([('', 0), ('', 0), ('', 0), ('', 0), ('', 0), ('', 0), ('', 0),
> ('', 0), ('', 0), ('', 0)],
> dtype=[('f2', '|S20'), ('f0', '<i4')])
> In that case, that would be a great feature to add.
That's what Travis is proposing. I would like to see a function that
does this (however it is implemented under the covers):
nra = subset_fields(ra, ['f0', 'f2'])
With the view, I don't think you can reorder the fields as in your example.
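For illustration only, one way such a function could be sketched is with the dict form of the dtype constructor, which accepts explicit offsets and a total itemsize. The name subset_fields comes from the line above; the body is an assumption, not an existing NumPy function, and it keeps the fields in their original order:

```python
import numpy as np

def subset_fields(arr, names):
    """Return a view of `arr` that exposes only the fields in `names`.

    Hypothetical helper: the bytes of the omitted fields stay in place
    but are simply not named in the new dtype.
    """
    dt = arr.dtype
    new_dt = np.dtype({
        'names': list(names),
        # dt.fields maps each name to (field dtype, byte offset[, title]).
        'formats': [dt.fields[n][0] for n in names],
        'offsets': [dt.fields[n][1] for n in names],
        # Keep the original record size so the view lines up element by element.
        'itemsize': dt.itemsize,
    })
    return arr.view(new_dt)

t = np.dtype([('f0', 'i4'), ('f1', 'f8'), ('f2', 'S20')])
ra = np.zeros(10, dtype=t)
nra = subset_fields(ra, ['f0', 'f2'])

# Writing through the view changes the original array: no copy was made.
nra['f0'][0] = 42
print(ra['f0'][0])
print(nra.dtype.names, nra.dtype.itemsize == ra.dtype.itemsize)
```

Because the new dtype has the same itemsize as the old one, the view needs no dummy field at the end; the gaps between the named fields are just unnamed padding.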
"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
-- Umberto Eco