[Numpy-discussion] Response to PEP suggestions

Travis Oliphant oliphant at ee.byu.edu
Thu Feb 17 10:53:22 CST 2005


I'm glad to get the feedback.

1) Types

I like Francesc's suggestion that .typecode return a code and .type 
return a Python class.  What is the general opinion on using attributes 
versus methods for this kind of thing?  The choice of what should be an 
attribute and what should be a method always seems arbitrary to me.

There will definitely be support for the numarray-style type 
specification.  Something like that will be how they print (though I 
like the 'i4', 'f4' specification a bit better).  There will also be 
support for specification in terms of a C type.  The typecodes will 
still be there, underneath.
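
Concretely, something like this is what I have in mind (just a sketch; 
none of these names are final):

    a = array([1.0, 2.0, 3.0], 'f4')      # width string: 4-byte float
    b = array([1.0, 2.0, 3.0], Float32)   # named type object
    c = array([1, 2, 3], 'i4')            # width string: 4-byte integer

    a.typecode    # -> 'f'      (the single-character code, still there)
    a.type        # -> Float32  (a Python class representing the type)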

One thing has always bothered me, though.  Why is the double-precision 
complex type Complex64 and the single-precision complex type Complex32?  
This breaks the idea that the number at the end specifies a bit width.  
Why don't we just call them Complex64 and Complex128?  Can we change this?

I'm also glad that some recognize the problems with always requiring 
specification of types in terms of bit widths or byte widths, since 
these are not the same across platforms.  For some types (like Int8 or 
Int16) this is not a problem.  But what about long double?  On an Intel 
machine long double is Float96, while on a PowerPC it is Float128.  
Wouldn't it be easier to specify LDouble or 'g' than to special-case 
your code?

Problems also exist when you are interfacing with hardware or other C or 
Fortran code.  You know you want single-precision floating point; you 
don't know or care what the bit width is.  I think bit-width 
specification matters more for the integer types than for the 
floating-point types.  In sum, I think it is important to be able to 
specify types both ways.  When printing an array, it is probably better 
to give bit-width information.  I like the way numarray prints arrays.
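
For instance (again only a sketch, with illustrative names):

    s = array([1.0], 'f')        # C float (single precision), by typecode
    x = array([1.0], LDouble)    # C long double, by name
    y = array([1.0], 'g')        # the same long double, by typecode
    z = array([1.0], Float96)    # explicit bit width: only correct on
                                 # platforms where long double is 96 bits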


2) Multidimensional array indexing.

Sometimes it is useful to select elements of an array based on their 
linear (flattened) index in the array.  MATLAB, for example, will let 
you take a three-dimensional array and index it with a single integer 
based on its Fortran order:  x(1,1,1), x(2,1,1), ...

What I'm proposing would make X[K] essentially equivalent to X.flat[K].  
The problem with always requiring X.flat[K] is that X.flat does not work 
for discontiguous arrays.  It could be made to work if X.flat returned 
some kind of specially-marked array, which would then have to be checked 
every time any array is indexed.  Or there may be some way to have 
X.flat return an "indexable iterator" for X, which may be the more 
Pythonic thing to do anyway.  That could solve the problem and handle 
the discontiguous X.flat case as well.

If we can make X.flat[K] work for discontiguous arrays, then I would be 
very happy to not special-case the single index array but always treat 
it as a 1-tuple of integer index arrays.
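
In other words, something like this (a sketch of the proposed behavior, 
using the familiar arange/reshape names):

    X = reshape(arange(12), (3, 4))
    K = [0, 5, 11]        # positions in the flattened (C-order) array

    X.flat[K]             # proposed: selects elements 0, 5, and 11
    X[K]                  # proposed: equivalent to X.flat[K]

    Y = X[:, ::2]         # a discontiguous array
    Y.flat                # fails today; an "indexable iterator" returned
                          # here would make Y.flat[K] work as well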

Capping indexes was proposed because of what numarray does.  The only 
benefit I can think of is that you don't have to check for and raise an 
error in the middle of an indexing loop, or pre-scan the indexes.  But I 
suppose that is unavoidable anyway.  Currently Numeric allows indexes 
that are too high in slices; it just chops them.  Python allows this 
too, for slices.  So I'm really just specifying Python behavior.  Of 
course, indexing with an integer that is too large or too small will 
raise an error:

In Python:

    a = [1, 2, 3, 4, 5]
    a[:20]    # works: the slice is clipped to the full list
    a[20]     # raises an IndexError
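
The proposed array behavior would simply mirror this (sketch):

    b = arange(5)
    b[:20]        # works: the slice is clipped, all 5 elements returned
    b[20]         # raises an error, just as for the list above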


3)  Always returning rank-0 arrays.

This may be a bit controversial since it is a change.  But my experience 
is that quite a bit of extra code gets written to check whether a 
calculation returned a Python scalar (because scalars don't have the 
same methods as arrays).  In particular, len(a) does not work if a is a 
Python scalar, but len(b) works if b is a rank-0 array (numeric scalar).  
Rank-0 arrays are scalars.

When Python needs a scalar, it will generally ask the object whether it 
can turn itself into an int or a float.  A notable exception is indexing 
into a list, where Python requires a true integer and won't ask the 
object to convert itself.  But int(b) always returns a Python integer 
if the array has only one element.
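
A small illustration of the asymmetry (sketch only):

    b = array(3.0)        # rank-0 array (numeric scalar)
    s = 3.0               # Python scalar

    int(b)                # works: the array converts itself when asked
    float(b)              # works
    b.shape               # () -- the usual array attributes are available
    s.shape               # AttributeError: a Python float has no .shape
    [10, 20, 30][b]       # fails: list indexing insists on a real integer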

I'd like to know what reasons people can think of for ever returning 
Python scalars  unless explicitly asked for.


Thanks for the suggestions.

-Travis
