[Numpy-discussion] default axis for numarray

Perry Greenfield perry at stsci.edu
Mon Jun 10 13:37:02 CDT 2002


An issue that has been raised by scipy (most notably Eric Jones
and Travis Oliphant) is whether the default axis used by
various functions should be changed from the current Numeric
default. This message is not directed at determining whether we
should change the current Numeric behavior for Numeric, but whether
numarray should adopt the same behavior as the current Numeric.

To be more specific, certain functions and methods, such as
add.reduce(), operate by default on the first axis. For example,
if x is a 2 x 10 array, then add.reduce(x) results in a
10 element array, where the elements along the first dimension have
been summed rather than those along the most rapidly varying dimension.

>>> x = arange(20)
>>> x.shape = (2,10)
>>> x
array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]])
>>> add.reduce(x)
array([10, 12, 14, 16, 18, 20, 22, 24, 26, 28])
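
For comparison, the two candidate defaults can be written out with an
explicit axis argument. This is only a sketch: it assumes NumPy as a
stand-in for Numeric/numarray (its add.reduce also defaults to the
first axis), and the names first and last are purely illustrative.

```python
import numpy as np  # assumption: NumPy stands in for Numeric/numarray

x = np.arange(20).reshape(2, 10)

# Reduce over the first (least rapidly varying) axis: the current default.
first = np.add.reduce(x, axis=0)

# Reduce over the last (most rapidly varying) axis: the proposed alternative.
last = np.add.reduce(x, axis=-1)

print(first.shape, last.shape)
```

The first form yields one value per column (10 elements), the second
one value per row (2 elements).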

Some feel it is contrary to expectations that the least rapidly
varying dimension is the one operated on by default. There are
good arguments on both sides. For example, Konrad Hinsen has
argued that the current behavior is the most compatible with the
behavior of other Python sequences. For instance,

>>> sum = 0
>>> for subarr in x:
...     sum += subarr

acts, in effect, on the first axis. Likewise,

>>> reduce(add, x)

does the same. In this sense, Numeric is currently more consistent
with Python behavior. However, there are other functions that
operate on the most rapidly varying dimension. Unfortunately,
I cannot currently access my old mail, but I think the rule
proposed under this argument was that if the 'reduction'
operation is of a structural kind, the first dimension is used.
If the reduction or processing step is 'time-series' oriented
(e.g., FFT, convolve), then the last dimension is the default.
On the other hand, some feel it would be much simpler to understand
if the last axis were always the default.
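
The mixed convention can be illustrated roughly as follows. Again this
is a sketch that assumes NumPy as a stand-in for Numeric/numarray, and
the variable names are hypothetical: a sequence-style reduce over the
rows matches a reduction over the first axis, while the FFT defaults
to the last axis.

```python
from functools import reduce
import operator

import numpy as np  # assumption: NumPy stands in for Numeric/numarray

x = np.arange(20).reshape(2, 10)

# Iterating over a 2-D array yields its rows, so a sequence-style
# reduce sums along the first axis, like add.reduce's default.
seq_style = reduce(operator.add, x)
structural = np.add.reduce(x)      # default axis is the first

# 'Time-series' style operation: the FFT defaults to the last axis,
# so each row is transformed independently.
spectrum = np.fft.fft(x)           # default axis is the last
row_by_row = np.array([np.fft.fft(row) for row in x])

print(np.array_equal(seq_style, structural))
print(np.allclose(spectrum, row_by_row))
```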

The question is whether there is a consensus for one approach or
the other. We raised this issue at a scientific Birds-of-a-Feather
session at the last Python Conference. The sense I got there was
that most were for the status quo, keeping the behavior as it is
now. Is the same true here? In the absence of consensus or a
convincing majority, we will keep the behavior the same for backward
compatibility purposes.

Perry





