[Numpy-discussion] Response to PEP suggestions
oliphant at ee.byu.edu
Thu Feb 17 10:53:22 CST 2005
I'm glad to get the feedback.
I like Francesc's suggestion that .typecode return a code and .type
return a Python class. What is the general attitude toward using
attributes versus methods for this kind of thing? The choice between
an attribute and a method always seems so arbitrary to me.
There will definitely be support for the numarray-style type
specification. Something like that will be how they print (though I like
the 'i4', 'f4' specification a bit better). There will also be
support for specification in terms of a C type. The typecodes will
still be there, underneath.
One thing has always bothered me, though. Why is a double complex type
Complex64 and a float complex type Complex32? This seems to break the
idea that the number at the end specifies a bit width. Why don't we
just call them Complex64 (float complex) and Complex128 (double
complex)? Can we change this?
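
The arithmetic behind the proposed names can be checked with a small
sketch. The helper complex_name is hypothetical and only illustrates the
naming rule "total bits = 2 x bits per component"; it is not part of
Numeric or numarray. The sizes come from the platform's C float and C
double as reported by the standard struct module.

```python
import struct

# Bit width of one real component, taken from the native C type's size.
FLOAT_BITS = 8 * struct.calcsize("f")    # C float
DOUBLE_BITS = 8 * struct.calcsize("d")   # C double

def complex_name(component_bits):
    """Hypothetical helper: name a complex type by its total bit width."""
    # A complex value stores a real part and an imaginary part.
    return "Complex%d" % (2 * component_bits)

float_complex = complex_name(FLOAT_BITS)
double_complex = complex_name(DOUBLE_BITS)
print(float_complex, double_complex)
```

On platforms with a 4-byte float and an 8-byte double, the names come
out as the ones proposed above rather than Complex32 and Complex64.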
I'm also glad that some recognize the problems with always requiring
types to be specified in terms of bit widths or byte widths, as these
are not the same across platforms. For some types (like Int8 or Int16)
this is not a problem. But what about long double? On an Intel machine
long double is Float96, while on a PowerPC it is Float128. Wouldn't it
just be easier to specify LDouble or 'g' than to special-case your code?
Problems also exist when you are interfacing with hardware or other C or
Fortran code. You know you want single-precision floating point. You
don't know or care what the bit width is. I think the bit-width
specification is more important for the integer types than for the
floating-point types. In sum, I think it is important to have the
ability to specify it both ways. When printing the array, it's probably
better if it gives bit-width information. I like the way numarray
prints arrays.
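
A minimal sketch of specifying a type "both ways", using only the
standard ctypes module. The table C_TYPES and the helpers width_name and
c_names_for are hypothetical names made up for this illustration; the
point is only that the C-type name is portable while the bit-width name
for long double depends on the platform.

```python
import ctypes

# Hypothetical table: portable C-type names mapped to ctypes types.
C_TYPES = {
    "Float": ctypes.c_float,
    "Double": ctypes.c_double,
    "LDouble": ctypes.c_longdouble,
}

def width_name(c_name):
    """Bit-width spelling (e.g. 'Float64') of a C-type name on this platform.

    The width is the storage size, padding included, which is why an
    80-bit extended value can show up as Float96 or Float128.
    """
    bits = 8 * ctypes.sizeof(C_TYPES[c_name])
    return "Float%d" % bits

def c_names_for(width):
    """C-type names whose storage matches the given bit-width name here."""
    return sorted(name for name in C_TYPES if width_name(name) == width)

names = {c_name: width_name(c_name) for c_name in C_TYPES}
for c_name, width in sorted(names.items()):
    print(c_name, "->", width)
```

Code that asks for "LDouble" runs unchanged everywhere, while the
printed bit-width name tells the reader what was actually allocated.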
2) Multidimensional array indexing.
Sometimes it is useful to select some elements out of an array based on
their linear (flattened) index in the array. MATLAB, for example, will
allow you to take a three-dimensional array and index it with a single
integer based on its Fortran order: x(1,1,1), x(2,1,1), ...
What I'm proposing would have X[K] essentially equivalent to X.flat[K].
The problem with always requiring the use of X.flat[K] is that X.flat
does not work for discontiguous arrays. It could be made to work if
X.flat returned some kind of specially-marked array, which would then
have to be checked every time indexing occurred for any array. Or,
there may be some way to have X.flat return an "indexable iterator" for X,
which may be a more Pythonic thing to do anyway. That could solve the
linear-indexing problem and the discontiguous X.flat problem as well.
If we can make X.flat[K] work for discontiguous arrays, then I would be
very happy to not special-case the single index array but always treat
it as a 1-tuple of integer index arrays.
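
One way the "indexable iterator" idea could look is sketched below in
plain Python. The class FlatView is hypothetical, not an existing
Numeric or numarray object. It assumes the array is described by a flat
buffer, a shape, and strides counted in elements (not bytes), and it
uses C (row-major) order for the linear index.

```python
class FlatView:
    """Hypothetical indexable iterator over a strided view of a buffer."""

    def __init__(self, buffer, shape, strides, offset=0):
        self.buffer = buffer
        self.shape = tuple(shape)
        self.strides = tuple(strides)   # assumed to be in elements
        self.offset = offset
        self.size = 1
        for n in self.shape:
            self.size *= n

    def _buffer_index(self, k):
        if not -self.size <= k < self.size:
            raise IndexError("flat index out of range")
        if k < 0:
            k += self.size
        pos = self.offset
        # Peel coordinates off the linear index, last axis first (C order).
        for n, stride in zip(reversed(self.shape), reversed(self.strides)):
            k, coord = divmod(k, n)
            pos += coord * stride
        return pos

    def __len__(self):
        return self.size

    def __getitem__(self, k):
        return self.buffer[self._buffer_index(k)]

    def __setitem__(self, k, value):
        self.buffer[self._buffer_index(k)] = value

    def __iter__(self):
        return (self[k] for k in range(self.size))


# A 3x4 row-major buffer, viewed through every other column (discontiguous).
buf = list(range(12))
every_other_column = FlatView(buf, shape=(3, 2), strides=(4, 2))
print(list(every_other_column))
```

Because the view computes buffer positions on demand, no copy is made
and assignment through the view writes into the original buffer.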
Capping indexes was proposed because of what numarray does. I can only
think that the benefit would be that you don't have to check for and
raise an error in the middle of an indexing loop or pre-scan the
indexes. But I suppose this is unavoidable anyway. Currently Numeric
allows specifying indexes that are too high in slices. It just chops
them. Python allows this too, for slices. So, I guess I'm just
specifying Python behavior. Of course indexing with an integer that is
too large or too small will raise errors:
a = [1,2,3,4,5]
a[i] raises an error when the integer i is out of range.
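
The Python behavior referred to here can be seen with a plain list. This
is a small sketch of built-in list semantics only; the helper
index_is_valid is a hypothetical name used for the illustration.

```python
a = [1, 2, 3, 4, 5]

# Slice bounds that are too large are silently clipped.
clipped = a[2:100]
empty = a[10:20]

def index_is_valid(seq, i):
    """Hypothetical helper: report whether seq[i] is a legal integer index."""
    try:
        seq[i]
    except IndexError:
        return False
    return True

print(clipped, empty)
print(index_is_valid(a, 5), index_is_valid(a, -6), index_is_valid(a, -1))
```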
3) Always returning rank-0 arrays.
This may be a bit controversial as it is a bit of a change. But, my
experience is that quite a bit of extra code is written to check whether
or not a calculation returns a Python-scalar (because these don't have
the same methods as arrays). In particular len(a) does not work if a is
a scalar, but len(b) works if b is a rank-0 array (numeric scalar).
Rank-0 arrays are scalars.
When Python needs a scalar it will generally ask the object if it can
turn itself into an int or a float. A notable exception is indexing in
a list (where Python needs an integer and won't ask the object to
convert if it can). But int(b) always returns a Python integer if the
array has only 1 element.
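
The conversion protocol described above can be sketched with a
stand-in object. The class Rank0 is hypothetical and is not a real
array type; it only defines the __int__ and __float__ hooks that Python
calls for int(b) and float(b), which is enough to show that list
indexing does not fall back on those hooks.

```python
class Rank0:
    """Hypothetical stand-in for a rank-0 array holding one value."""

    def __init__(self, value):
        self.value = value

    def __int__(self):
        # Called by int(b).
        return int(self.value)

    def __float__(self):
        # Called by float(b).
        return float(self.value)


b = Rank0(2)
as_int = int(b)
as_float = float(b)

items = [10, 20, 30]
try:
    items[b]                 # list indexing does not call __int__
    list_index_worked = True
except TypeError:
    list_index_worked = False

picked = items[int(b)]       # explicit conversion is needed here
print(as_int, as_float, list_index_worked, picked)
```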
I'd like to know what reasons people can think of for ever returning
Python scalars unless explicitly asked for.
Thanks for the suggestions.