[Numpy-discussion] Numarray design announcement
Paul F Dubois
paul at pfdubois.com
Mon Jul 22 16:14:03 CDT 2002
At numpy.sf.net you will find a posting from Perry Greenfield and me
detailing the design decisions we have taken with respect to Numarray.
What follows is the text of that message without the formatting. We ask
for your understanding about those decisions that differ from the ones
you might prefer.
Paul F. Dubois and Perry Greenfield
Numarray is the new implementation of the Numeric Python extension. It
is our intention that users will change as rapidly as possible to the
new module when we decide it is ready. The present Numeric Python team
will cease supporting Numeric after a short transition period.
During recent months there has been a lot of discussion about Numarray
and whether or not it should differ from Numeric in certain ways. We
have reviewed this lengthy discussion and come to some conclusions about
what we plan to do. The discussion has been valuable in that it took a
whole new "generation" back through the considerations that the
"founding fathers" debated when Numeric Python was designed.
There are literally tens of thousands of Numerical Python users. These
users may represent only a tiny percentage of potential users but they
are real users today with real code that they have written, and breaking
that code would represent real harm to real people. Most of the issues
discussed recently were discussed at length when Numeric was first
designed. Some decisions taken then represent a choice that was simply a
choice among valid alternatives. Nevertheless, the choice was made, and
to arbitrarily now make a different choice would be difficult to
In arguing about Python's indentation, we often see heart-felt arguments
from opponents who have sincere reasons for feeling as they do. However,
many of the pitfalls they point to do not seem to actually occur in real
life very often. We feel the same way about many arguments about Numeric
Python. The view / copy argument, for example, claims that beginners
will make errors with view semantics. Well, some do, but not very often,
and not twice. It is just one of many differences that users need to
adapt to when learning an entity-object model such as Python's when they
are used to variable semantics such as in Fortran or C. Similarly, we do
not receive massive reports of confusion about differing default values
for the axis keyword -- there was a rationale for the way it is now, and
although one could propose a different rationale for a different choice,
it would be just a choice.
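
For readers unfamiliar with the view / copy distinction mentioned above,
here is a minimal sketch using only the Python standard library rather
than Numeric itself. It assumes that a memoryview slice is a fair
stand-in for an array slice with view semantics, and a list slice for
copy semantics; the variable names are illustrative only.

```python
# View semantics: a memoryview slice shares storage with its source.
data = bytearray([1, 2, 3, 4])
view = memoryview(data)[1:3]   # no data is copied
view[0] = 99                   # writes through to the underlying buffer
view_result = list(data)       # the source now holds 1, 99, 3, 4

# Copy semantics: a list slice is an independent object.
items = [1, 2, 3, 4]
copied = items[1:3]            # a new list is allocated
copied[0] = 99                 # the original list is unaffected
copy_result = list(items)
```

The surprise for a newcomer is the first case: changing the slice
changes the original.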
Numarray will have the same Python interface as Numeric, with the
exceptions discussed below.
1. The Numarray C API includes a compatibility layer consisting of some
of the members of the Numeric C API. For details on compatibility at the
C level see
http://telia.dl.sourceforge.net/sourceforge/numpy/numarray.pdf , pdf
pages 78-81. Since no formal decision was ever made about what parts of
the Numeric C header file were actually intended to be publicly
available, do not expect complete emulation.
Numarray's current view of arrays in C, using either native or emulation
C-APIs, is that array data can be mutated, but array properties cannot.
Thus, an existing Numeric extension function which tries to change the
shape or strides of an array in C is more of a porting challenge,
possibly requiring a Python wrapper. Depending on what kind of
optimization we do, this restriction might be lifted. For the Numeric
extensions already ported to Numarray (RandomArray, LinearAlgebra, FFT),
none of this was an issue.
2. Currently, if an index operation x[i] yields a scalar, the result is
converted to a similar Python type. For example, the result of
array([1,2,3])[1] is the Python integer 2. This will be changed so that
the result of an index operation on a Numarray array is always a
Numarray array. Scalar results will become rank-zero arrays (i.e.,
shape () ).
3. Currently, binary operations involving Numeric arrays and Python
scalars use the precision of the Python scalar to help determine the
precision of the result. In Numarray, the precision of the array will
have precedence in determining the precision of the outcome. Full
details are available in the Numarray documentation.
4. The Numarray version of MA will no longer have copy semantics on
indexing but instead will be consistent with Numarray. (The decision to
make MA differ in this regard was due to a need for CDAT to be backward
compatible with a local variant of Numeric; the CDAT user community no
longer feels this was necessary).
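
The precision rule in item 3 can be pictured with a toy model. This is
only a sketch and not Numarray's actual coercion code: types are
represented as hypothetical (kind, bits) pairs, "the wider type wins" is
a simplification of the Numeric behavior, and mixed kinds (for example
an integer array with a float scalar) are deliberately left out. The
Numarray documentation remains the reference for the real rules.

```python
# Toy model only: (kind, bits) pairs, not real Numeric or Numarray types.
FLOAT32 = ("float", 32)
FLOAT64 = ("float", 64)
PYTHON_FLOAT = FLOAT64  # a Python float is a C double


def numeric_style(array_type, scalar_type):
    """Scalar precision takes part: the wider of the two types wins."""
    if array_type[0] != scalar_type[0]:
        raise NotImplementedError("mixed kinds are outside this sketch")
    return max(array_type, scalar_type, key=lambda t: t[1])


def numarray_style(array_type, scalar_type):
    """Array precision takes precedence over the scalar's precision."""
    if array_type[0] != scalar_type[0]:
        raise NotImplementedError("mixed kinds are outside this sketch")
    return array_type


old_result = numeric_style(FLOAT32, PYTHON_FLOAT)
new_result = numarray_style(FLOAT32, PYTHON_FLOAT)
```

In this model, multiplying a 32-bit float array by a Python float gives
a 64-bit result under the old rule and stays at 32 bits under the new
one.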
Some explanation about the scalar change is in order. Currently, much
coding in Numeric-based applications must be devoted to handling the
fact that after an index operation, the programmer can not assume that
the result is an array. So, what are the consequences of change? A
rank-zero array will interact as expected with most other parts of
Python. When it does not, the most likely result is a type error. For
example, let x = array([1,2,3])[0]. Then [1,2,3][x] currently produces
the result 2. With the change, it would produce a type error unless a
change is made to the Python core (currently under discussion). But
x[x] would still work because we have control of that. In short, we
do not think this change will break much code and it will prevent the
writing of more code that is either broken or difficult to write
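
The type error described above can be reproduced without Numarray. The
sketch below uses two hypothetical classes, RankZero and
IndexableRankZero, as stand-ins for a rank-zero array. Present-day
Python provides the __index__ method to let an object act as a sequence
index; this message does not say what form the core change under
discussion would take, so treating __index__ as that mechanism is an
assumption.

```python
class RankZero:
    """Hypothetical stand-in for a rank-zero array holding one integer."""

    def __init__(self, value):
        self.value = value


class IndexableRankZero(RankZero):
    """The same object, but opting in to use as a sequence index."""

    def __index__(self):
        return self.value


def try_index(seq, i):
    """Return seq[i], or the string 'TypeError' if indexing is refused."""
    try:
        return seq[i]
    except TypeError:
        return "TypeError"


plain_result = try_index([1, 2, 3], RankZero(1))
indexable_result = try_index([1, 2, 3], IndexableRankZero(1))
```

Here plain_result is the string "TypeError", because a list does not
accept an arbitrary object as an index, while indexable_result is 2.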