[Numpy-discussion] BOF notes: Fernando's proposal: NumPy ndarray with named axes
Gael Varoquaux
gael.varoquaux@normalesup....
Mon Jul 12 05:03:30 CDT 2010
On Sun, Jul 11, 2010 at 11:59:30AM +0000, Neil Crighton wrote:
> What is a use case for the new array type that can't be solved by
> structured/record arrays? Sounds like it was decided at the SciPy
> BOF they were a good idea, several people have implemented a
> version of them and Fernando and Gael have both said they find
> them useful, so they must have something going for them. Maybe
> Fernando or Gael could share an example where arrays with named
> axes and indices are especially useful, for the peanut gallery's
> benefit?
Because my name is in this e-mail, I feel obliged to answer, but I think
that my use cases and opinions are not any more important than anybody
else's.
Let's say that you have a dataset in a 3D array, where axis 0
corresponds to days, axis 1 to hours of the day, and axis 2 to
temperature. You might want the mean of the temperature averaged over
the days, which in current numpy would be:
data.mean(axis=0)
or the mean of the temperature averaged over the hours of each day,
which would be:
data.mean(axis=1)
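For illustration, here is a small sketch of which axis each of the two calls collapses in current numpy. The array shape (7 days, 24 hours, a single temperature reading) and the random values are made up for the example and are not from the original discussion.

```python
import numpy as np

# Hypothetical data: 7 days x 24 hours x 1 measurement (temperature).
rng = np.random.default_rng(0)
data = rng.normal(loc=15.0, scale=5.0, size=(7, 24, 1))

# axis=0 collapses the day axis: what remains is indexed by hour.
over_days = data.mean(axis=0)

# axis=1 collapses the hour axis: what remains is indexed by day.
over_hours = data.mean(axis=1)

print(over_days.shape)   # (24, 1)
print(over_hours.shape)  # (7, 1)
```

Nothing in the result records which axis was removed; the only trace is the shape, which is exactly the bookkeeping that becomes tedious.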
I do such manipulations all the time, and keeping track of which axis is
what is fairly tedious and error-prone. It would be much nicer to be able
to write:
data.ax_day.mean(axis=0)
data.ax_hour.mean(axis=0)
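The idea can be approximated today with a thin wrapper. The sketch below is only an assumption about how such an object might behave: the class `NamedAxes` and its methods are hypothetical, and this is neither the `ax_day` syntax written above nor any implementation discussed at the BOF.

```python
import numpy as np


class NamedAxes:
    """Hypothetical wrapper: pairs an ndarray with one name per axis."""

    def __init__(self, values, names):
        values = np.asarray(values)
        if values.ndim != len(names):
            raise ValueError("need exactly one name per axis")
        self.values = values
        self.names = tuple(names)

    def mean(self, name):
        # Look up the position of the named axis, then reduce over it.
        axis = self.names.index(name)
        remaining = self.names[:axis] + self.names[axis + 1:]
        return NamedAxes(self.values.mean(axis=axis), remaining)

    def transpose(self, *names):
        # Reorder axes by name; the names travel with the data.
        order = [self.names.index(n) for n in names]
        return NamedAxes(self.values.transpose(order), names)


data = NamedAxes(np.arange(7 * 24, dtype=float).reshape(7, 24), ("day", "hour"))
per_hour = data.mean("day")                          # reduce over the day axis
swapped = data.transpose("hour", "day").mean("day")  # same call after reordering
```

The call `mean("day")` gives the same values whether or not the array was transposed first, because the reduction is addressed by name instead of by position.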
Also, when writing library functions that deal with such data, it's
quite easy to introduce errors in the computations through
transpositions or other reorderings. Given an array, I have no way of
telling which axis corresponds to what, or of tracing such errors. If my
library has a convention that each ndarray should have a named axis
called 'time', I know how to do timeseries analysis on multidimensional
data.
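As a rough sketch of such a convention, a library function could look up the 'time' axis by name and refuse input that lacks one. The function `demean_over_time` and the way the names are passed alongside the array (as a plain tuple) are assumptions made for this example, not an existing numpy feature.

```python
import numpy as np


def demean_over_time(values, names):
    """Subtract the temporal mean, wherever the 'time' axis happens to be."""
    if "time" not in names:
        raise ValueError("expected an axis named 'time'")
    axis = names.index("time")
    # keepdims=True keeps the reduced axis so the subtraction broadcasts.
    return values - values.mean(axis=axis, keepdims=True)


a = np.arange(12, dtype=float).reshape(3, 4)

# The same data in two layouts; the function finds 'time' in both.
out1 = demean_over_time(a, ("time", "sensor"))
out2 = demean_over_time(a.T, ("sensor", "time"))
```

Both calls give the same result up to the transposition, and an array without a 'time' axis fails loudly instead of silently averaging over the wrong dimension.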
My 2 cents,
Gaël