[Numpy-discussion] A case for rank-0 arrays

Sasha ndarray at mac.com
Fri Feb 24 09:16:04 CST 2006


On 2/24/06, Travis Oliphant <oliphant.travis at ieee.org> wrote:
> Sasha wrote:
>...
> >I propose to change numpy rules so that if ellipsis is present inside
> >[], the operation is always projection and both y[1,...] and
> >x[1,1,...] return zero-rank arrays.  Note that I have previously
> >rejected Francesc's idea that x[...] and x[()] should have different
> >meaning for zero-rank arrays.  I was wrong.
> >
> >
> I think this is a good and clear rule.  And it seems like we may be
> "almost" there.
> Anybody want to implement it?
>
I'll implement it.  I think I am well prepared to handle this after
implementing [] for the rank-0 case.
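
To make the rule concrete, here is a small sketch of the intended
behavior.  The arrays x (rank 2) and y (rank 1) are stand-ins made up
for the earlier example, and the sketch assumes a numpy in which the
ellipsis rule is in effect (current releases, imported as np, behave
this way):

```python
import numpy as np

# Hypothetical stand-ins for the x and y of the earlier example.
x = np.arange(6).reshape(2, 3)   # rank 2
y = np.arange(3)                 # rank 1

# Pure integer indexing selects an element and yields a scalar.
elem = y[1]

# With an ellipsis the operation is a projection: the result stays an
# ndarray, here of rank 0.
proj_y = y[1, ...]
proj_x = x[1, 1, ...]

# For a rank-0 array, [...] keeps the array while [()] extracts the scalar.
z = np.array(5)
kept = z[...]
extracted = z[()]

print(type(elem).__name__, type(proj_y).__name__, proj_x.shape)
print(type(kept).__name__, type(extracted).__name__)
```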


> >2. Another source of ambiguity is the various "reduce" operations such
> >as sum or max.  Using the previous example, type(x.sum(axis=0)) is
> >ndarray, but type(y.sum(axis=0)) is int32.  I propose two changes:
> >
> >   a. Make x.sum(axis)  return ndarray unless axis is None, making
> >type(y.sum(axis=0)) is ndarray true in the example.
> >
> Hmm... I'm not sure.  y.sum(axis=0) is the default spelling of sum(y).
> Thus, this would cause all old code to return a rank-0 array.
>
> Most people who write sum(y) want a scalar, not a "function with 0
> arguments"
>

That's a valid concern.  Maybe we can first agree that it will be
helpful to have some way of implementing the sum operation that always
returns ndarray, even in the dimensionless case.  Once we agree on this
goal we can choose a spelling for such an operation.  One possibility,
if we implement (b), is to keep the old behavior for y.sum(axis=0) but
make y.sum(axis=(0,)) return an ndarray in all cases.  The ugliness of
that spelling may be an advantage because it conveys a "you know what
you are doing" message.
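
Whatever spelling we pick, the behavior I am after can be emulated by
wrapping the result.  A minimal sketch, where sum_as_array is a name
invented for illustration and not a numpy function, and x and y are
again made-up stand-ins:

```python
import numpy as np

def sum_as_array(a, axis=0):
    """Hypothetical helper: sum along `axis`, always returning an ndarray."""
    # np.asarray wraps a scalar result in a rank-0 array and leaves an
    # ndarray result untouched.
    return np.asarray(np.asarray(a).sum(axis=axis))

x = np.arange(6).reshape(2, 3)   # rank 2
y = np.arange(3)                 # rank 1

sx = sum_as_array(x)   # rank-1 ndarray
sy = sum_as_array(y)   # rank-0 ndarray rather than a scalar

print(sx.shape, sy.shape)
```

The point is only that the return type no longer depends on the rank
of the input.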

> >   b. Allow axis to be a sequence of ints and make
> >x.sum(axis=range(rank(x))) return rank-0 array according to the rule
> >2.a above.
> >
> So, this would sum over multiple axes?  I guess I'm not opposed to
> something like that, but I'm not really excited about it either.

It looks like this is the kind of proposal that has a better chance of
being adopted once someone implements it.  I will definitely implement
it if it becomes a requirement for (a) because I do need some way to
spell sum that does not change the type in the dimensionless case.
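
As a rough sketch of what (b) would look like in use, assuming a numpy
whose sum accepts a tuple of axes (current releases do).  The final
np.asarray stands in for rule 2.a and is an addition for illustration,
not something sum does by itself:

```python
import numpy as np

x = np.arange(6).reshape(2, 3)   # made-up rank-2 example

# Spell "sum over every axis" as an explicit tuple of axes.
all_axes = tuple(range(x.ndim))
total = x.sum(axis=all_axes)

# Emulate rule 2.a by forcing the result to a rank-0 ndarray.
total_arr = np.asarray(total)

print(total_arr.shape, int(total_arr))
```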

> Would that make sense for all methods that take the axis= argument?
>
I think so, but I did not review all the cases.


>
> >   c. Make x.sum() raise an error for rank-0 arrays and scalars, but
> >allow x.sum(axis=()) to return x.  This will make numpy sum consistent
> >with the built-in sum that does not work on scalars.
> >
> I don't think I like this at all.
>
Can you be more specific about what you don't like?  Why should numpy
sum be different from the built-in sum?  Numpy made dimensionless
arrays non-iterable; isn't it logical to make them non-summable as
well?
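
The two behaviors I am comparing can be checked directly.  This is
only an illustration; raises_type_error is a helper name made up for
the example:

```python
import numpy as np

def raises_type_error(func, *args):
    """Return True if func(*args) raises TypeError (hypothetical helper)."""
    try:
        func(*args)
    except TypeError:
        return True
    return False

# The built-in sum needs an iterable, so it rejects a plain scalar.
builtin_sum_rejects_scalar = raises_type_error(sum, 1)

# A rank-0 array cannot be iterated over either.
rank0_not_iterable = raises_type_error(iter, np.array(1))

print(builtin_sum_rejects_scalar, rank0_not_iterable)
```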

Note that in the dimensionful case providing a non-existent axis is an error:

>>> array([1]).sum(1)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ValueError: axis(=1) out of bounds


Why should this not be an error in the dimensionless case?  The current
behavior is rather odd:
>>> array(1).sum(axis=0)
1
>>> array(1).sum(axis=1)
1

> >I propose to make shape=() valid in ndarray constructor.
> >
> >
> +1
Will do.

> I think we need more thinking about rank-0 arrays before doing something
> like proposal 2.  However, 1 and 3 seem simple enough to move forward
> with...

Sounds like a plan!



