[Numpy-discussion] Adding the ability to "clone" a few fields from a data-type

Travis E. Oliphant oliphant@enthought....
Thu Oct 30 08:27:56 CDT 2008

> I'm not sure what this accomplishes. Would the dummy fields that fill
> in the space be inaccessible? E.g. tuple(subarr[i,j,k]) gives a tuple
> with no numpy.void scalars? That would be a novel feature, but I'm not
> sure it fits the problem. On the contrary:

Yes, that was the idea. You can do it now, but only in C. The real
problem right now, from my point of view, is that there is no way to
tell the dtype constructor to "pad the itemsize to x bytes". If that
were changed, then many things would be possible.
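
For what it's worth, NumPy releases later than this message accept an
explicit 'itemsize' key in the dict form of the dtype constructor, which
is one way to express "pad the itemsize to x bytes". A minimal sketch,
assuming such a release; the field names and the 26-byte size are only
illustrative:

```python
import numpy as np

# Two fields occupying 14 bytes, with each element padded out to 26 bytes.
padded = np.dtype({
    'names': ['date', 'close'],
    'formats': ['S10', 'f4'],
    'offsets': [0, 10],
    'itemsize': 26,   # trailing padding after the last field
})

print(padded.names)      # only the named fields; the padding has no name
print(padded.itemsize)
```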

> OTOH, now that I think about it, I don't think there is really any
> coherent way to mix field selection with any other indexing
> operations. At least, not within the same brackets. Hmm. So maybe the
> link to fancy indexing can be ignored as, ahem, fanciful.

Yeah, I was wondering how to do it well myself and couldn't come up
with anything, which is why I went the .view route with another dtype.

By "inaccessible and invisible dtype" do you mean something like the 
basic built-in void data type, but which doesn't try to report itself 
when the dtype prints?

That sounds interesting, but I'm not sure it's necessary, because the
field specification can already skip bytes (just not bytes at the end,
which is what I would like to fix). Perhaps what is needed is a
"pseudo-dtype" (something like 'c' compared to 'S1') which doesn't
actually create a new dtype but which is handled differently when the
dtype is created with the [('field1', type), ('field2', type2)]
approach. Specifically, it adds no entry to the fields dictionary and
no entry to the names, but it does affect the itemsize of the element
(and the offsets of any fields that follow it).
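
To illustrate the point about skipping bytes, here is a small sketch
using the dict form with explicit offsets. Bytes between fields can be
skipped, but the itemsize stops at the end of the last field unless
something else pads it. Padding with ordinary void fields is also
possible, but those fields are named and visible. The names 'gap',
'voids', 'pad1' and 'pad2' are made up for this example:

```python
import numpy as np

# Skip 8 bytes between 'date' (bytes 0-9) and 'close' (bytes 18-21).
gap = np.dtype({
    'names': ['date', 'close'],
    'formats': ['S10', 'f4'],
    'offsets': [0, 18],
})
print(gap.itemsize)   # ends right after 'close'; no trailing bytes

# Padding with explicit void fields reaches the full size, but each
# void field gets a name and shows up in .names and .fields.
voids = np.dtype([('date', 'S10'), ('pad1', 'V8'),
                  ('close', 'f4'), ('pad2', 'V4')])
print(voids.names, voids.itemsize)
```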

So, let's assume the character is 'v':

If we have an array with underlying dtype:

od = [('date', 'S10'), ('high', 'f4'), ('low', 'f4'), ('close', 'f4'), 
('volume', 'i4')]

Then, we could define a new dtype

nd = [('date', 'S10'), ('', 'v8'), ('close', 'f4'), ('', 'v4')]

and arr.view(nd) would provide a view of the array in which selecting
an element gives a tuple with just the date and close values. The
itemsize would be exactly the same, but nd.names would be
['date', 'close'].

I like this approach. It impacts the API the least while still
providing the desired functionality.
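
The 'v' pseudo-dtype above is only a proposal, so the sketch below does
not use it. Instead it approximates the same view with the dict form of
the dtype constructor, assuming a NumPy release that accepts 'offsets'
and 'itemsize' there. The sample rows are invented for illustration:

```python
import numpy as np

od = np.dtype([('date', 'S10'), ('high', 'f4'), ('low', 'f4'),
               ('close', 'f4'), ('volume', 'i4')])

# Stand-in for the proposed
#   [('date', 'S10'), ('', 'v8'), ('close', 'f4'), ('', 'v4')]
# Same offsets and itemsize as od, but only two named fields.
nd = np.dtype({
    'names': ['date', 'close'],
    'formats': ['S10', 'f4'],
    'offsets': [od.fields['date'][1], od.fields['close'][1]],
    'itemsize': od.itemsize,
})

arr = np.array([(b'2008-10-29', 13.0, 12.0, 12.5, 1000),
                (b'2008-10-30', 14.0, 12.5, 13.25, 2000)], dtype=od)
sub = arr.view(nd)

print(sub.dtype.names)   # only 'date' and 'close'
print(tuple(sub[0]))     # a 2-tuple: date and close

# The view shares memory with the original array.
sub['close'][0] = 99.0
print(arr['close'][0])
```

Because this is a view rather than a copy, writes through sub land in
arr, and the skipped high, low and volume bytes are left untouched.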



> Overall, I guess, I would present the feature slightly differently.
> Provide a kind of inaccessible and invisible dtype for implementing
> dummy fields. This is useful in other places like file parsing. At the
> same time, implement a function that uses this capability to make
> views with a subset of the fields of a structured array. I'm not sure
> that people need an API for replacing the fields of a dtype like this.
