[Numpy-discussion] Shape (z,0)

Anne Archibald aarchiba@physics.mcgill...
Fri Nov 28 04:25:32 CST 2008


2008/11/28 T J <tjhnson@gmail.com>:
>>>> import numpy as np
>>>> x = np.ones((3,0))
>>>> x
> array([], shape=(3, 0), dtype=float64)
>
> To preempt, I'm not really concerned with the answer to:  Why would
> anyone want to do this?
>
> I just want to know what is happening.  Especially, with
>
>>>> x[0,:] = 5
>
> (which works).  It seems that nothing is really happening here...given
> that, why is it allowed? I.e., are there reasons for not requiring the
> shape dimensions to be greater than 0?

So that scripts can work transparently with arrays of all sizes:

In [1]: import numpy as np

In [3]: a = np.random.randn(5); b = a[a>1]; print b.shape
(1,)

In [4]: a = np.random.randn(5); b = a[a>1]; print b.shape
(1,)

In [5]: a = np.random.randn(5); b = a[a>1]; print b.shape
(0,)

In [10]: b[:]=b[:]-1

The ":" just means "all", so it's fine to use it if there aren't any
("all of none").
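
A minimal sketch of the same point, including the x[0,:] = 5 assignment from the original question. The array values and the threshold are picked only for illustration, so that the mask is guaranteed to select nothing:

```python
import numpy as np

# A (3, 0) array has three rows, each holding zero elements.
x = np.ones((3, 0))
x[0, :] = 5            # assigns 5 to "all" elements of row 0; there are none

# A boolean mask that matches nothing yields a shape-(0,) array.
a = np.array([0.2, -1.3, 0.7, 0.1, -0.4])   # illustrative values, none > 1
b = a[a > 1]

# Elementwise arithmetic and slice assignment still work unchanged.
b[:] = b[:] - 1
total = b.sum()        # the sum over no elements is 0.0
```

The code after the mask never has to ask whether anything was selected.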

Basically, if this were not permitted, there would be a zillion little
corner cases that code trying to be generic would have to deal with.

There is a similar issue with zero-dimensional arrays for code that is
trying to be generic for number of dimensions. That is, you want to be
able to do something like:

In [12]: a = a-np.mean(a,axis=-1)[...,np.newaxis]

and have it work whether a is an n-dimensional array, in which case
np.mean(a,axis=-1) is (n-1)-dimensional, or a is a one-dimensional
array, in which case np.mean(a,axis=-1) is a zero-dimensional array,
or maybe a scalar:

In [13]: np.mean(a)
Out[13]: 0.0

In [14]: 0.0[...,np.newaxis]
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)

/homes/janeway/aarchiba/<ipython console> in <module>()

TypeError: 'float' object is unsubscriptable

In [15]: np.mean(a)[...,np.newaxis]
Out[15]: array([ 0.])
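
Wrapped up as a function, the idiom might look like the sketch below. The name center_last_axis is made up for this example, and the conversion to float is an assumption added so that plain lists are accepted too:

```python
import numpy as np

def center_last_axis(a):
    """Subtract the mean along the last axis (hypothetical helper)."""
    a = np.asarray(a, dtype=float)
    # For n-D input this is (n-1)-dimensional; for 1-D input it is a
    # NumPy scalar, which still accepts [..., np.newaxis].
    m = np.mean(a, axis=-1)
    # np.newaxis puts back the axis that mean() dropped, so the
    # subtraction broadcasts along the last axis.
    return a - m[..., np.newaxis]

v = center_last_axis([1.0, 2.0, 3.0])                 # 1-D input
M = center_last_axis(np.arange(6.0).reshape(2, 3))    # 2-D input
```

The same two lines of arithmetic serve both inputs without any test on the number of dimensions.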

Normalizing this stuff is still ongoing; it's tricky, because you
often want something like np.mean(a) to be just a number, but generic
code wants it to behave like a zero-dimensional array. Currently numpy
supports both "array scalars", that is, numbers of array dtypes, and
zero-dimensional arrays; they behave mostly alike, but there are a few
inconsistencies (and it's arguably redundant to have both).
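
A short sketch of the two kinds of object side by side. The variable names are only for illustration, and mutability is just one of the differences, picked here because it is easy to show:

```python
import numpy as np

s = np.mean(np.array([1.0, 2.0, 3.0]))   # array scalar (np.float64)
z = np.array(2.0)                         # zero-dimensional array

# Both report zero dimensions and an empty shape tuple.
ndims = (np.ndim(s), np.ndim(z))
shapes = (s.shape, z.shape)

# Only the zero-dimensional array is an ndarray instance.
s_is_array = isinstance(s, np.ndarray)
z_is_array = isinstance(z, np.ndarray)

# A zero-dimensional array can be assigned to in place.
z[...] = 5.0

# np.asarray converts an array scalar to a zero-dimensional array,
# which generic code can use to normalize its inputs.
s0 = np.asarray(s)
```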

That said, it is often far easier to get generic code right by
flattening the input array, dealing with it as a
guaranteed-one-dimensional array, then reconstructing the correct shape
at the end, although the extra reshaping is kind of a pain to write.
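
A rough sketch of that flatten-then-reshape pattern. The function name and the operation it performs are made up for illustration; the explicit loop stands in for any logic that is only easy to write in one dimension:

```python
import numpy as np

def clip_negative_to_zero(a):
    """Replace negative entries by 0.0 (hypothetical example operation)."""
    a = np.asarray(a, dtype=float)
    # reshape(-1) always gives a 1-D array, even for 0-d or empty input;
    # copy() keeps the caller's data untouched.
    flat = a.reshape(-1).copy()
    for i in range(flat.size):        # deliberately simple 1-D logic
        if flat[i] < 0:
            flat[i] = 0.0
    # Put the original shape back at the end.
    return flat.reshape(a.shape)

r0 = clip_negative_to_zero(-1.5)                        # 0-d input
r2 = clip_negative_to_zero([[1.0, -2.0], [-3.0, 4.0]])  # 2-D input
r_empty = clip_negative_to_zero(np.ones((3, 0)))        # empty input
```

The price is the bookkeeping at both ends, but the middle never has to think about shape.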

Anne

