[Numpy-discussion] Adding the ability to "clone" a few fields from a data-type

Francesc Alted faltet@pytables....
Thu Oct 30 11:59:52 CDT 2008


On Thursday 30 October 2008, Robert Kern wrote:
[clip]
> >> OTOH, now that I think about it, I don't think there is really any
> >> coherent way to mix field selection with any other indexing
> >> operations. At least, not within the same brackets. Hmm. So maybe
> >> the link to fancy indexing can be ignored as, ahem, fanciful.
> >
> > Well, one can always check that fields in the fancy list are either
> > strings (map to name fields) or integers (map to positional
> > fields). However, I'm not sure if this check would be too
> > expensive.
>
> That's not my concern. The problem is that the field-indexing applies
> to the entire array, not just an axis. So what would the following
> mean?
>
>   a[['foo', 'bar'], [1,2,3]]
>
> Compared to
>
>   a[[5,8,10], [1,2,3]]

Well, as I see it, fields are like another axis, except that it is
always the leading one.  To cope with them, we could generalize what
already works:

In [15]: ra = numpy.zeros((3,4), "i4,f4")

In [16]: ra['f1'][[1,2],[0,3]]  # this already works
Out[16]: array([ 0.,  0.], dtype=float32)

In [17]: ra[['f0','f1']][[1,2],[0,3]]   # this could be made to work
Out[17]:
array([(0, 0.0), (0, 0.0)],
      dtype=[('f0', '<i4'), ('f1', '<f4')])
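
The chained form above can be tried out as a small sketch, assuming a
NumPy release that accepts a list of field names as an index (later
releases do).  The arange fill is only there so that the selected
elements can be told apart; it is not part of the original example.

```python
import numpy as np

# Same layout as in the session above: a (3, 4) array with fields f0 and f1.
ra = np.zeros((3, 4), dtype="i4,f4")
# Illustrative data only: make every f1 element distinct.
ra["f1"] = np.arange(12, dtype="f4").reshape(3, 4)

# Field selection first, then ordinary fancy indexing on the result.
# This picks elements (1, 0) and (2, 3) of the f1 field.
single = ra["f1"][[1, 2], [0, 3]]

# Same idea with a list of fields: the result keeps both fields and
# has one record per selected (row, column) pair.
both = ra[["f0", "f1"]][[1, 2], [0, 3]]

print(single)
print(both.dtype.names, both.shape)
```

Keeping the field selection in its own pair of brackets sidesteps the
ambiguity Robert points out, since the second pair of brackets only
ever sees array axes.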

> >> Overall, I guess, I would present the feature slightly
> >> differently. Provide a kind of inaccessible and invisible dtype
> >> for implementing dummy fields. This is useful in other places like
> >> file parsing. At the same time, implement a function that uses
> >> this capability to make views with a subset of the fields of a
> >> structured array. I'm not sure that people need an API for
> >> replacing the fields of a dtype like this.
> >
> > Mmh, I'm not sure what you are proposing there.  Do you mean
> > something like:
> >
> > In [21]: t = numpy.dtype([('f0','i4'),('f1', 'f8'), ('f2', 'S20')])
> >
> > In [22]: nt = t.astype(['f2', 'f0'])
> >
> > In [23]: ra = numpy.zeros(10, dtype=t)
> >
> > In [24]: nra = ra.view(nt)
> >
> > In [25]: ra
> > Out[25]:
> > array([(0, 0.0, ''), (0, 0.0, ''), (0, 0.0, ''), (0, 0.0, ''),
> >       (0, 0.0, ''), (0, 0.0, ''), (0, 0.0, ''), (0, 0.0, ''),
> >       (0, 0.0, ''), (0, 0.0, '')],
> >      dtype=[('f0', '<i4'), ('f1', '<f8'), ('f2', '|S20')])
> >
> > In [26]: nra
> > Out[26]:
> > array([('', 0), ('', 0), ('', 0), ('', 0), ('', 0), ('', 0),
> >       ('', 0), ('', 0), ('', 0), ('', 0)],
> >      dtype=[('f2', '|S20'), ('f0', '<i4')])
> >
> > ?
> >
> > In that case, that would be a great feature to add.
>
> That's what Travis is proposing. I would like to see a function that
> does this (however it is implemented under the covers):
>
>   nra = subset_fields(ra, ['f0', 'f2'])

Interesting.
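
As a rough sketch of how such a function might look (the name
subset_fields is taken from Robert's message; the implementation is
only an assumption, not an existing NumPy function): build a dtype that
keeps the chosen fields at their original offsets and keeps the
original itemsize, then take a view with it.

```python
import numpy as np

def subset_fields(arr, names):
    """Hypothetical helper: view `arr` with only the fields in `names`.

    The fields stay at their original byte offsets and the itemsize is
    unchanged, so the result shares memory with `arr`.
    """
    dt = arr.dtype
    view_dtype = np.dtype({
        "names": list(names),
        "formats": [dt.fields[n][0] for n in names],  # per-field dtype
        "offsets": [dt.fields[n][1] for n in names],  # per-field offset
        "itemsize": dt.itemsize,  # skipped fields become unnamed padding
    })
    return arr.view(view_dtype)

t = np.dtype([("f0", "i4"), ("f1", "f8"), ("f2", "S20")])
ra = np.zeros(10, dtype=t)
nra = subset_fields(ra, ["f0", "f2"])

# Writing through the view changes the original array.
nra["f0"][3] = 7
print(nra.dtype.names, ra["f0"][3])
```

The bytes of the fields left out are still present in every record of
the view; they are simply not reachable by name.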

> With the view, I don't think you can reorder the fields as in your
> example.

That's a pity.  Giving a dtype the notion of an internal reordering of
its fields could be very powerful in some situations, but I guess that
implementing it would be complicated.
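
For what it is worth, a dtype built from a dict with explicit offsets
can at least list the names in a different order than they are laid
out in memory.  The following is a sketch under that assumption, with
made-up sample values; whether it is good enough for the use case
discussed here is a separate question.

```python
import numpy as np

t = np.dtype([("f0", "i4"), ("f1", "f8"), ("f2", "S20")])

# List f2 before f0, but keep each field at its original byte offset.
nt = np.dtype({
    "names": ["f2", "f0"],
    "formats": [t.fields["f2"][0], t.fields["f0"][0]],
    "offsets": [t.fields["f2"][1], t.fields["f0"][1]],
    "itemsize": t.itemsize,
})

ra = np.zeros(3, dtype=t)
ra["f0"] = [1, 2, 3]            # illustrative values
ra["f2"] = [b"a", b"b", b"c"]   # illustrative values

nra = ra.view(nt)  # no data is copied; only the field order differs
print(nra.dtype.names)
print(nra["f2"], nra["f0"])
```

Only the order of the names changes here; the memory layout of each
record is the same as in ra.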

-- 
Francesc Alted

