[Numpy-discussion] Adding the ability to "clone" a few fields from a data-type
Travis E. Oliphant
Thu Oct 30 08:27:56 CDT 2008
> I'm not sure what this accomplishes. Would the dummy fields that fill
> in the space be inaccessible? E.g. tuple(subarr[i,j,k]) gives a tuple
> with no numpy.void scalars? That would be a novel feature, but I'm not
> sure it fits the problem. On the contrary:
Yes, that was the idea. You can do it now, but only in C. The real
problem right now from my point of view is that there is no way to tell
the dtype constructor to "pad the itemsize to x bytes". If that were
changed, then many things would be possible.
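To make "pad the itemsize to x bytes" concrete, here is a minimal sketch. It assumes a later NumPy release in which the dict form of the dtype constructor accepts an 'itemsize' key; that is not the 2008 constructor discussed in this message, and the field name 'x' is only illustrative.

```python
import numpy as np

# Assumption: a NumPy version whose dict-style dtype spec supports 'itemsize'.
# One 4-byte float field, with the record padded out to 16 bytes.
padded = np.dtype({
    'names': ['x'],
    'formats': ['f4'],
    'offsets': [0],
    'itemsize': 16,   # trailing padding: 12 unused bytes after 'x'
})

arr = np.zeros(3, dtype=padded)
print(padded.itemsize)   # size of one record, padding included
print(arr.strides)       # distance in bytes between consecutive records
```

Only 'x' shows up in the names; the trailing bytes affect the itemsize and nothing else.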
> OTOH, now that I think about it, I don't think there is really any
> coherent way to mix field selection with any other indexing
> operations. At least, not within the same brackets. Hmm. So maybe the
> link to fancy indexing can be ignored as, ahem, fanciful.
Yeah, I was wondering how to do it well myself and couldn't come up
with anything, which is why I went the .view route with another dtype.
By "inaccessible and invisible dtype" do you mean something like the
basic built-in void data type, but which doesn't try to report itself
when the dtype prints?
That sounds interesting but I'm not sure it's necessary because the
field specification can already skip bytes (just not bytes at the end
--- which is what I would like to fix). Perhaps what is needed is a
"pseudo-dtype" (something like 'c' compared to 'S1') which doesn't
actually create a new dtype but which is handled differently when the
dtype is created with the [('field1', type), ('field2', type2)]
approach. Specifically, it doesn't add an entry to the fields
dictionary nor an entry to the names but does affect the itemsize of the
element (and the offset of follow-on fields).
So, let's assume the character is 'v':
If we have an array with underlying dtype:
od = [('date', 'S10'), ('high', 'f4'), ('low', 'f4'), ('close', 'f4'),
Then, we could define a new dtype
nd = [('date', 'S10'), ('', 'v8'), ('close', 'f4'), ('', 'v4')]
and arr.view(nd) would provide a view of the array where element
selection would be a tuple with just the date and close elements. The
itemsize would be exactly the same, but nd.names would be ['date', 'close'].
I like this approach. It impacts the API the least but still provides
the desired functionality.
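The effect of the proposed 'v' entries can be approximated without any new type character. The sketch below does so with explicit offsets and an explicit itemsize, assuming a NumPy version whose dict-style dtype spec accepts both. The od listing above is cut off, so the trailing 4-byte field is filled in here with a hypothetical ('volume', 'i4') purely so the example runs.

```python
import numpy as np

# Underlying dtype. The last field is an assumption (hypothetical name and
# type); it only needs to occupy the final 4 bytes of the record.
od = np.dtype([('date', 'S10'), ('high', 'f4'), ('low', 'f4'),
               ('close', 'f4'), ('volume', 'i4')])

# Same itemsize, but only 'date' and 'close' are exposed. The skipped bytes
# play the role of the ('', 'v8') and ('', 'v4') entries in the proposal.
nd = np.dtype({
    'names': ['date', 'close'],
    'formats': ['S10', 'f4'],
    'offsets': [od.fields['date'][1], od.fields['close'][1]],
    'itemsize': od.itemsize,
})

arr = np.zeros(3, dtype=od)
arr['date'] = b'2008-10-30'
arr['close'] = [1.0, 2.0, 3.0]

sub = arr.view(nd)       # no copy: same buffer, fewer visible fields
sub['close'][0] = 9.5    # writes through to the original array
print(sub.dtype.names, sub.itemsize)
print(tuple(sub[0]))     # only the date and close elements
```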
> Overall, I guess, I would present the feature slightly differently.
> Provide a kind of inaccessible and invisible dtype for implementing
> dummy fields. This is useful in other places like file parsing. At the
> same time, implement a function that uses this capability to make
> views with a subset of the fields of a structured array. I'm not sure
> that people need an API for replacing the fields of a dtype like this.
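A function of the kind suggested in the quoted text, one that makes a view with a subset of the fields, could be sketched as follows. The name view_fields is hypothetical and not part of NumPy, and the sketch again assumes a NumPy version whose dict-style dtype spec accepts 'offsets' and 'itemsize'.

```python
import numpy as np

def view_fields(arr, names):
    """Return a view of a structured array exposing only the given fields.

    Hypothetical helper: the fields keep their original offsets and the
    itemsize is unchanged, so the result shares memory with `arr`.
    """
    base = arr.dtype
    sub = np.dtype({
        'names': list(names),
        'formats': [base.fields[n][0] for n in names],  # field dtypes
        'offsets': [base.fields[n][1] for n in names],  # byte offsets
        'itemsize': base.itemsize,
    })
    return arr.view(sub)

# Illustrative data only.
data = np.zeros(4, dtype=[('a', 'i4'), ('b', 'f8'), ('c', 'i2')])
ac = view_fields(data, ['a', 'c'])
ac['c'][1] = 7           # visible through the original array as well
print(ac.dtype.names, ac.itemsize)
```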