[Numpy-discussion] A case for rank-0 arrays
Travis Oliphant
oliphant.travis at ieee.org
Thu Feb 23 21:34:01 CST 2006
Sasha wrote:
>The main criticism of supporting both scalars and rank-0 arrays is
>that it is "unpythonic" in the sense that it provides two almost
>equivalent ways to achieve the same result. However, I am now
>convinced that this is the case where practicality beats purity.
>
>
I think most of us agree that both will be with us for the indefinite
future.
>The situation with ndarrays is somewhat similar. A rank-N array is
>very similar to a function with N arguments, where each argument has a
>finite domain (i-th domain of a is range(a.shape[i])). A rank-0 array
>is just a function with no arguments and as such it is quite different
>from a scalar.
>
I can buy this view. Nicely done.
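
As a rough illustration of the analogy, here is a minimal sketch in plain Python. The helper `as_function` is hypothetical and exists only for this example; it wraps nested lists rather than a real ndarray.

```python
def as_function(data, rank):
    """View nested lists of the given rank as a function of `rank` indices."""
    def f(*idx):
        if len(idx) != rank:
            raise TypeError(f"expected {rank} indices, got {len(idx)}")
        out = data
        for i in idx:          # apply one index per dimension
            out = out[i]
        return out
    return f

a2 = as_function([[1, 2], [3, 4]], rank=2)  # rank-2: a function of two arguments
a0 = as_function(7, rank=0)                 # rank-0: a function of no arguments

value_2d = a2(1, 0)   # element at row 1, column 0
value_0d = a0()       # the single element, obtained by calling with no arguments
```

Note that `a0` is still something you call, not the value 7 itself, which is the distinction Sasha is drawing.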
>Just as a function with no arguments cannot be
>replaced by a constant in the case when a value returned may change
>during the run of the program, a rank-0 array cannot be replaced by an
>array scalar because it is mutable. (See
>http://projects.scipy.org/scipy/numpy/wiki/ZeroRankArray for use
>cases).
>
>Rather than trying to hide rank-0 arrays from the end-user and treat
>them as an implementation artifact, I believe numpy should emphasize the
>difference between rank-0 arrays and scalars and have clear rules on
>when to use what.
>
>
I agree. The problem is what the rules should be. Right now, there are
no clear rules other than rank-0 arrays --- DON'T.
You make a case that we should not be so hard on rank-0 arrays.
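
The mutability point can be sketched as follows, assuming a recent NumPy release (the 2006 code base under discussion may behave differently). The names `cell`, `alias`, `s`, and `t` are made up for the example.

```python
import numpy as np

cell = np.array(0)    # rank-0 array: shape (), one mutable element
alias = cell          # second name bound to the same array object
cell[...] = 5         # in-place assignment; `alias` sees the change

s = np.int64(0)       # array scalar: immutable
t = s
s += 5                # rebinds `s` to a new scalar; `t` still holds 0
```

So a rank-0 array can act as a shared, updatable cell, while an array scalar cannot.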
>PROPOSALS
>==========
>
>Here are three suggestions:
>
>1. Probably the most controversial question is what getitem should
>return. I believe that most of the confusion comes from the fact that
>the same syntax implements two different operations: indexing and
>projection (for lack of a better name). Using the analogy between
>ndarrays and functions, indexing is just the application of the
>function to its arguments and projection is the function projection
>((f, x) -> lambda (*args): f(x, *args)).
>
>The problem is that the same syntax results in different operations
>depending on the rank of the array.
>
>Let
>
>
>>>>x = ones((2,2))
>>>>y = ones(2)
>
>then x[1] is projection and type(x[1]) is ndarray, but y[1] is
>indexing and type(y[1]) is int32. Similarly, y[1,...] is indexing,
>while x[1,...] is projection.
>
>I propose to change numpy rules so that if ellipsis is present inside
>[], the operation is always projection and both y[1,...] and
>x[1,1,...] return zero-rank arrays. Note that I have previously
>rejected Francesc's idea that x[...] and x[()] should have different
>meaning for zero-rank arrays. I was wrong.
>
>
I think this is a good and clear rule. And it seems like we may be
"almost" there.
Anybody want to implement it?
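
For reference, here is a short sketch of how the indexing/projection distinction looks, assuming a recent NumPy release, where indexing with an Ellipsis returns an array, as far as I can tell. Note that `ones` defaults to float64 there, so the scalar type is not int32 as in the quoted example.

```python
import numpy as np

x = np.ones((2, 2))
y = np.ones(2)

proj_x = x[1]            # projection: rank-1 ndarray
idx_y = y[1]             # indexing: array scalar
proj_y = y[1, ...]       # Ellipsis present: rank-0 ndarray
proj_x0 = x[1, 1, ...]   # Ellipsis present: rank-0 ndarray

z = np.array(3.0)        # a zero-rank array
z_view = z[...]          # stays a rank-0 ndarray
z_item = z[()]           # extracts the array scalar
```

This treats `[...]` as projection and `[()]` as indexing for zero-rank arrays, along the lines of Francesc's idea mentioned above.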
>2. Another source of ambiguity is the various "reduce" operations such
>as sum or max. Using the previous example, type(x.sum(axis=0)) is
>ndarray, but type(y.sum(axis=0)) is int32. I propose three changes:
>
> a. Make x.sum(axis) return ndarray unless axis is None, making
>type(y.sum(axis=0)) is ndarray true in the example.
>
>
>
Hmm... I'm not sure. y.sum(axis=0) is the default spelling of sum(y).
Thus, this would cause all old code to return a rank-0 array.
Most people who write sum(y) want a scalar, not a "function with 0
arguments".
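
A small sketch of the ambiguity in question, assuming a recent NumPy release: the same call returns an ndarray or a scalar depending on the rank of the input. Wrapping the result with `np.asarray` is just one way to get a rank-0 array explicitly; it is not part of the proposal.

```python
import numpy as np

x = np.ones((2, 2))
y = np.ones(2)

sx = x.sum(axis=0)      # rank-2 input: result is a rank-1 ndarray
sy = y.sum(axis=0)      # rank-1 input: result is an array scalar
sy0 = np.asarray(sy)    # explicit opt-in to a rank-0 ndarray
```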
> b. Allow axis to be a sequence of ints and make
>x.sum(axis=range(rank(x))) return rank-0 array according to the rule
>2.a above.
>
>
So, this would sum over multiple axes? I guess I'm not opposed to
something like that, but I'm not really excited about it either. Would
that make sense for all methods that take the axis= argument?
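
For what it is worth, recent NumPy releases do accept a tuple of ints for `axis` in `sum`. The sketch below assumes such a release and says nothing about the code base being discussed here. Note that reducing over every axis yields a scalar there, not the rank-0 array that 2.b asks for.

```python
import numpy as np

x3 = np.ones((2, 3, 4))

partial = x3.sum(axis=(0, 2))                 # reduce axes 0 and 2, keep axis 1
total = x3.sum(axis=tuple(range(x3.ndim)))    # reduce over every axis
```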
> c. Make x.sum() raise an error for rank-0 arrays and scalars, but
>allow x.sum(axis=()) to return x. This will make numpy sum consistent
>with the built-in sum that does not work on scalars.
>
>
>
I don't think I like this at all.
This proposal has more far-reaching implications (and would require more
code changes --- though the axis= arguments do have a converter function
and so would not be as painful as one might imagine).
In short, I don't feel as enthused about portion 2 of your proposal.
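
To make 2.c concrete, here is a sketch comparing the built-in `sum` with the behavior I would expect from a recent NumPy release (an assumption; check your version). The built-in rejects a scalar, NumPy accepts a rank-0 array, and an empty `axis` tuple reduces over no axes.

```python
import numpy as np

# The built-in sum needs an iterable, so a bare scalar raises TypeError.
try:
    sum(5)
    builtin_accepts_scalar = True
except TypeError:
    builtin_accepts_scalar = False

z = np.array(5)
z_total = z.sum()            # accepted for a rank-0 array, no error raised

x = np.ones((2, 2))
unchanged = x.sum(axis=())   # no axes reduced: same shape and values as x
```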
>3. This is a really small change. Currently
>
>
>>>>empty(())
>array(0)
>
>but
>
>
>
>
>I propose to make shape=() valid in ndarray constructor.
>
>
+1
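
A minimal sketch of what proposal 3 amounts to, assuming a recent NumPy release in which the constructor accepts an empty shape. The contents of both arrays are uninitialized, so only shape and rank are meaningful here.

```python
import numpy as np

e = np.empty(())            # rank-0 array via the empty() helper
n = np.ndarray(shape=())    # rank-0 array via the ndarray constructor
```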
I think we need more thinking about rank-0 arrays before doing something
like proposal 2. However, 1 and 3 seem simple enough to move forward
with...
-Travis