[Numpy-discussion] Adding the ability to "clone" a few fields from a data-type

Robert Kern robert.kern@gmail....
Thu Oct 30 10:24:37 CDT 2008


On Thu, Oct 30, 2008 at 08:27, Travis E. Oliphant
<oliphant@enthought.com> wrote:
>
>> I'm not sure what this accomplishes. Would the dummy fields that fill
>> in the space be inaccessible? E.g. tuple(subarr[i,j,k]) gives a tuple
>> with no numpy.void scalars? That would be a novel feature, but I'm not
>> sure it fits the problem. On the contrary:
>>
>
> Yes, that was the idea.   You can do it now, but only in C.   The real
> problem right now from my point of view is that there is no way to tell
> the dtype constructor to "pad the itemsize to x bytes".    If that were
> changed, then many things would be possible.
>
>> OTOH, now that I think about it, I don't think there is really any
>> coherent way to mix field selection with any other indexing
>> operations. At least, not within the same brackets. Hmm. So maybe the
>> link to fancy indexing can be ignored as, ahem, fanciful.
>>
> Yeah,  I was wondering how to do it well, myself, and couldn't come up
> with anything, which is why I went the .view route with another dtype.
>
> By "inaccessible and invisible dtype" do you mean something like the
> basic built-in void data type, but which doesn't try to report itself
> when the dtype prints?

What I'm concerned with is that the field doesn't report itself when
the *values* print. The dtype should display the dummy fields such that
repr() can accurately reconstruct the dtype.

> That sounds interesting but I'm not sure it's necessary because the
> field specification can already skip bytes (just not bytes at the end
> --- which is what I would like to fix).    Perhaps what is needed is a
> "pseudo-dtype" (something like 'c' compared to 'S1') which doesn't
> actually create a new dtype but which is handled differently when the
> dtype is created with the [('field1', type), ('field2', type2)]
> approach.   Specifically, it doesn't add an entry to the fields
> dictionary nor an entry to the names but does affect the itemsize of the
> element (and the offset of follow-on fields).
>
> So, let's assume the character is 'v':
>
> If we have an array with underlying dtype:
>
> od = [('date', 'S10'), ('high', 'f4'), ('low', 'f4'), ('close', 'f4'),
> ('volume', 'i4')]
>
> Then, we could define a new dtype
>
> nd = [('date', 'S10'), ('', 'v8'), ('close', 'f4'), ('', 'v4')]

To do this, we would also have to fix the current behavior of
converting empty field names ('') to 'f0', 'f1', etc., when these are
passed to the dtype() constructor.
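
The renaming is easy to see in an interactive session. The snippet
below is only a sketch: it uses NumPy's existing void types ('V8',
'V4') as stand-ins for the proposed 'v' code, which does not exist,
and the variable names are mine.

```python
import numpy as np

# Empty field names are not kept; dtype() replaces each one with a
# generated name based on the field's position.
auto = np.dtype([('', 'i4'), ('', 'f4')])
print(auto.names)

# Stand-in for the proposed nd, with real void types ('V8', 'V4')
# in place of the hypothetical 'v8' and 'v4'.
nd = np.dtype([('date', 'S10'), ('', 'V8'), ('close', 'f4'), ('', 'V4')])
print(nd.names)     # the padding fields show up under generated names
print(nd.itemsize)  # 10 + 8 + 4 + 4 bytes
```

So with today's behavior the padding entries become ordinary, visible
fields, which is the opposite of what the proposal wants.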

> and  arr.view(nd)   would provide a view of the array where element
> selection would be a tuple with just the date and close elements. The
> itemsize would be exactly the same, but nd.names would be ['date', 'close']
>
> I like this approach.  It impacts the API the least but provides
> the desired functionality.
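
For comparison, here is a rough sketch of the intended result. It does
not use the proposed 'v' code. Instead it assumes a later NumPy release
in which the dictionary form of the dtype constructor accepts an
explicit 'itemsize' next to 'offsets'; that is an assumption about
newer versions, not something available in this thread. The sample
values are made up.

```python
import numpy as np

od = np.dtype([('date', 'S10'), ('high', 'f4'), ('low', 'f4'),
               ('close', 'f4'), ('volume', 'i4')])

# Only 'date' and 'close' are named. 'offsets' skips high/low, and
# 'itemsize' pads past 'volume' so the record size matches od.
nd = np.dtype({'names': ['date', 'close'],
               'formats': ['S10', 'f4'],
               'offsets': [0, od.fields['close'][1]],
               'itemsize': od.itemsize})

arr = np.zeros(2, dtype=od)
arr[0] = (b'2008-10-29', 11.0, 9.0, 10.5, 1000)
arr[1] = (b'2008-10-30', 12.0, 10.0, 11.5, 2000)

sub = arr.view(nd)        # no copy: same memory, fewer visible fields
print(sub.dtype.names)
print(tuple(sub[0]))      # only the date and close elements
sub['close'][1] = 99.0    # writes through to the original array
print(arr['close'])
```

The skipped bytes never appear in the values or in the names, while
the dtype itself still records the offsets and the total itemsize.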

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

