[Numpy-discussion] Tabular data package

Robert Kern robert.kern@gmail....
Mon Oct 5 17:58:34 CDT 2009

On Mon, Oct 5, 2009 at 17:52, Elaine Angelino <elaine.angelino@gmail.com> wrote:
> On Mon, Oct 5, 2009 at 6:36 PM, Robert Kern <robert.kern@gmail.com> wrote:
>> > the main reason we went with the recarray over the ndarray is because the
>> > recarray has a couple of useful construction functions (e.g.
>> > np.rec.fromrecords and np.rec.fromarrays).  not only are these functions
>> > convenient to use, they have nice data type inference properties which we'd
>> > have to rebuild ourselves if we wanted to avoid recarrays entirely.
>> Try np.rec.fromrecords(...).view(np.ndarray).
> Hi Robert, thanks for your email.  We definitely understand this use of
> .view().  However,  our question is,  should we have implemented tabular
> this way, e.g. in the tabarray constructor, first make a recarray and then
> view it as an ndarray?  (and then of course view it as a tabarray).

Do the minimum number of .view()s that you can get away with.
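
As a rough sketch of what a single view could look like: the recarray returned by np.rec.fromrecords can be viewed directly as the target subclass, with no intermediate .view(np.ndarray). TabArray below is a hypothetical stand-in, not the actual tabarray class from tabular, and the field names are invented for illustration.

```python
import numpy as np


# Hypothetical stand-in for tabular's tabarray; the real class does much more.
class TabArray(np.ndarray):
    pass


# Let fromrecords infer the field types from the data.
rec = np.rec.fromrecords([(1, 2.5), (2, 3.5)], names="x,y")

# One view: recarray -> TabArray, skipping the .view(np.ndarray) step.
tab = rec.view(TabArray)

print(type(tab).__name__)            # TabArray
print(isinstance(tab, np.recarray))  # False
print(tab["x"].tolist())             # [1, 2]
```

The result is no longer a recarray, so attribute-style field access on the array (rec.x) is gone, while name-based indexing (tab["x"]) still works.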

> This
> would have the effect of eliminating the extra recarray functionality, and
> some of its overhead as well. Is this the desirable design, or should we
> stick with recarrays?

Well, what other recarray functionality are you using? I addressed the
from*() functions because you said it was the main reason. What are
your other reasons?

> (Also, is first casting to recarrays and then viewing as ndarrays more
> expensive than if we went through ndarray directly?)

The overhead should be minuscule. No data is converted.
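
One way to check this is the small sketch below, which builds a recarray with np.rec.fromrecords, views it as a plain ndarray, and confirms that both objects share one buffer. The sample records and field names are made up for illustration.

```python
import numpy as np

# Field types are inferred from the records (integer, float, string).
rec = np.rec.fromrecords([(1, 2.5, "a"), (2, 3.5, "b")], names="x,y,label")

# Reinterpret the same memory as a plain ndarray.
plain = rec.view(np.ndarray)

# The field names chosen at construction survive the view.
print(plain.dtype.names)             # ('x', 'y', 'label')

# Both objects look at the same buffer; nothing was copied.
print(np.shares_memory(rec, plain))  # True

# A write through the ndarray view is visible through the recarray.
plain["x"][0] = 99
print(rec.x[0])                      # 99
```

Because the view only wraps the existing buffer in a new array object, its cost should not grow with the number of rows.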

Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco
