[Numpy-discussion] BOF notes: Fernando's proposal: NumPy ndarray with named axes

Keith Goodman kwgoodman@gmail....
Fri Jul 9 16:12:42 CDT 2010

On Fri, Jul 9, 2010 at 1:53 PM, Rob Speer <rspeer@mit.edu> wrote:
> Keith Goodman wrote:
>> I ran into a few more questions while playing with datarrays, so I started a list:
>> http://github.com/kwgoodman/datarrayQ
> I have quick answers to some of the questions.

Thank you! Comments below.

>> Can I have ticks without labels?
> Ideally, yes, but good catch: the current code disallows that for no
> good reason.
>> Add a ticks input parameter?
> I very much approve of this proposal (to add ticks= to define ticks
> separately from axes).
>> Create Axis._tick_dict when needed?
> Wait, the dictionary wouldn't be saved at all? What's the point, then?
> Constant-time lookups of tick names are essential, and this proposal
> would turn that into linear time.

I guess it depends on what you do most. If you do a lot of indexing
with ticks then carrying around the mapping dict will speed things up.
If you are creating a lot of datarrays with ticks it will slow things
down.
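
To make the trade-off concrete, here is a minimal sketch of one way to
read "create when needed": build the mapping on the first tick lookup and
keep it afterwards. The Axis class, its attributes, and the index() method
below are hypothetical stand-ins, not the actual datarray code.

```python
class Axis:
    """Hypothetical stand-in for an axis with ticks; not datarray's Axis."""

    def __init__(self, label, ticks):
        self.label = label
        self.ticks = tuple(ticks)   # stored as a tuple so the cache cannot go stale
        self._tick_dict = None      # nothing is built at construction time

    @property
    def tick_dict(self):
        # Build the tick -> position mapping on first use, then reuse it.
        if self._tick_dict is None:
            self._tick_dict = {tick: i for i, tick in enumerate(self.ticks)}
        return self._tick_dict

    def index(self, tick):
        # Linear-time build on the first call, constant-time lookups afterwards.
        return self.tick_dict[tick]


ax = Axis('time', ['a', 'b', 'c'])
built_before = ax._tick_dict is not None   # False: construction skipped the dict
position = ax.index('b')                   # triggers the one-time build
built_after = ax._tick_dict is not None    # True: the dict is now cached
```

With this variant, creating many datarrays stays cheap, and only the
arrays that are actually indexed by tick pay for the dict.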

>> Can we prevent user from messing up a datarray?
> No. That's pretty much built into Python: the downstream user can do
> anything they want to.
> Our job is to make sure that what the user wants to do is use the
> datarray correctly. :)

True, but right now it is very easy to put a datarray in an
inconsistent state (change ticks without updating the mapping dict,
for example). Removing some of the datarray attributes that depend on
the state of other attributes would make it more robust. Not
suggesting that it is a net benefit.
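
A small sketch of the kind of inconsistency I mean, assuming a
hypothetical axis object that stores the ticks and the mapping side by
side (the class and attribute names are made up for illustration):

```python
class EagerAxis:
    """Hypothetical axis that keeps ticks and a derived mapping as two attributes."""

    def __init__(self, ticks):
        self.ticks = list(ticks)
        self.mapping = {tick: i for i, tick in enumerate(self.ticks)}


ax = EagerAxis(['a', 'b'])
ax.ticks[0] = 'z'                    # user edits the ticks in place

old_tick_still_mapped = 'a' in ax.mapping      # True: mapping remembers the old tick
new_tick_missing = 'z' not in ax.mapping       # True: and knows nothing of the new one
```

Nothing raises here; the two attributes simply disagree from then on.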

>> 0d datarrays?
> As 0d datarrays are completely pointless, I'm pretty sure that any
> code that creates a 0d datarray is a mistake and should fail early.

Unless there is a good reason, making sure that datarrays behave like
an ndarray is a good target---less surprise for the user.
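
For reference, plain ndarrays do allow zero dimensions. A quick sketch
with NumPy only (nothing here touches datarray):

```python
import numpy as np

a = np.arange(6).reshape(2, 3)

# Wrapping a scalar result gives a 0d array: no axes, empty shape.
zero_d = np.array(a.sum())
ndim = zero_d.ndim
shape = zero_d.shape

# Indexing with an empty tuple pulls the single value back out.
value = zero_d[()]
```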

>> Can axis labels be anything besides None or str?
> Possibly. The part of this question I particularly like is accessing
> attributes programmatically, using arr.axis[axisname]. That gives
> .axis much more of a purpose. (Follow-up question: should we merge
> .axis and .axes in the API?)

Would be nice to have one that is dict-like, so dar.axis['labelname']
and one that is list-like, so dar.axis[idx] where idx is an int index.
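
A rough sketch of how a single accessor could support both styles,
assuming labels are either None or str. AxisInfo and AxisLookup are
hypothetical names, not part of datarray:

```python
from collections import namedtuple

# Hypothetical record describing one axis.
AxisInfo = namedtuple('AxisInfo', ['index', 'label'])


class AxisLookup:
    """Look up an axis by integer position (list-like) or by label (dict-like)."""

    def __init__(self, labels):
        self._axes = [AxisInfo(i, label) for i, label in enumerate(labels)]
        # Unlabeled axes (label None) are reachable by position only.
        self._by_label = {ax.label: ax for ax in self._axes
                          if ax.label is not None}

    def __getitem__(self, key):
        if isinstance(key, int):
            return self._axes[key]
        return self._by_label[key]


axis = AxisLookup(['time', None, 'channel'])
by_label = axis['channel']
by_position = axis[1]
```

Dispatching on the key type only works while labels cannot be ints; if
labels may be arbitrary objects, two separate accessors avoid the
ambiguity.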

>> Direct access to array?
> It's trivial: DataArray is a subclass of ndarray, so a DataArray
> already is an ndarray. If you want to strip off all the datarray stuff
> anyway (perhaps for efficiency reasons), you can use np.asarray(arr).

Does that come at the cost of a copy?
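
One way to check, sketched with a bare ndarray subclass standing in for
DataArray (the Labeled class is hypothetical; np.asarray and
np.shares_memory are regular NumPy functions):

```python
import numpy as np


class Labeled(np.ndarray):
    """Minimal stand-in ndarray subclass; not the real DataArray."""


arr = np.arange(4.0).view(Labeled)
base = np.asarray(arr)

is_plain_ndarray = type(base) is np.ndarray    # subclass has been stripped
same_buffer = np.shares_memory(arr, base)      # True means no data copy was made

# Writing through the plain view is visible in the original array.
base[0] = 99.0
first = float(arr[0])
```

For this stand-in, np.asarray returns a base-class view on the same
buffer rather than a copy.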

>> Support for alignment?
> Very yes. Aligning/joining labels is something that basically everyone
> who works with labeled data needs to do, so we should figure out the
> logic for it and include it in datarray so downstream users don't have
> to reinvent it.

So la.add(dar1, dar2, join='outer')?
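
To illustrate what an outer join would mean for a 1d add, here is a
minimal sketch on plain tick lists and NumPy arrays. The outer_add
function is hypothetical and only shows the alignment logic (union of
ticks, NaN where one side has no value); it is not how la or datarray
implement it.

```python
import numpy as np


def outer_add(ticks1, vals1, ticks2, vals2):
    """Add two labeled 1d sequences after aligning on the union of their ticks."""
    ticks = sorted(set(ticks1) | set(ticks2))
    pos = {tick: i for i, tick in enumerate(ticks)}

    # Start from NaN so ticks missing on one side propagate as NaN in the sum.
    left = np.full(len(ticks), np.nan)
    right = np.full(len(ticks), np.nan)
    left[[pos[t] for t in ticks1]] = vals1
    right[[pos[t] for t in ticks2]] = vals2
    return ticks, left + right


ticks, total = outer_add(['a', 'b'], [1.0, 2.0],
                         ['b', 'c'], [10.0, 20.0])
```

Only 'b' appears on both sides, so it is the only tick with a finite
sum; 'a' and 'c' come out as NaN.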

>> Can labels and ticks be changed?
> I'd favor them being immutable, but could have my mind changed by a
> good use case for mutating them.
> -- Rob
