[Numpy-discussion] Response to PEP suggestions
David M. Cooke
cookedm at physics.mcmaster.ca
Thu Feb 17 13:32:19 CST 2005
Travis Oliphant <oliphant at ee.byu.edu> writes:
> I'm glad to get the feedback.
> 1) Types
> I like Francesc's suggestion that .typecode return a code and .type
> return a Python class. What is the attitude and opinion regarding
> the use of attributes or methods for
> this kind of thing? It always seems to me so arbitrary as to what is
> an attribute or what
> is a method.
If it's an intrinsic attribute (heh) of the object, I usually try to
make it an attribute. So I'd make these attributes.
> There will definitely be support for the numarray-style type
> specification. Something like that will be how they print (I like
> the 'i4', 'f4', specification a bit better though). There will also be
> support for specification in terms of a c-type. The typecodes will
> still be there, underneath.
+1. I think labelling types with their sizes at some level is necessary
for cross-platform compatibility (more below).
> One thing has always bothered me though. Why is a double complex type
> Complex64? and a float complex type Complex32. This seems to break
> the idea that the number at the end specifies a bit width. Why don't
> we just call it Complex64 and Complex128? Can we change this?
Or rename to ComplexFloat32 and ComplexFloat64?
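To make the two naming options concrete, here is a small sketch of the arithmetic. It uses only the standard struct module to count bits; the name lists are illustrative and are not an existing Numeric or numarray API.

```python
import struct

# A complex number stored as two single-precision floats (real, imag).
single_pair_bits = 8 * struct.calcsize("<2f")
# A complex number stored as two double-precision floats.
double_pair_bits = 8 * struct.calcsize("<2d")

# Convention 1: the number is the total width of the complex value.
by_total_width = ["Complex%d" % single_pair_bits,
                  "Complex%d" % double_pair_bits]

# Convention 2: the number is the width of each floating-point component.
by_component = ["ComplexFloat%d" % (single_pair_bits // 2),
                "ComplexFloat%d" % (double_pair_bits // 2)]

print(by_total_width, by_component)
```

Either way the number is a real bit width; the two conventions differ only in whether it counts the whole value or one component.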
> I'm also glad that some recognize the problems with always requiring
> specification of types in terms of bit-width or byte-widths as these
> are not the same across platforms. For some types (like Int8 or
> Int16) this is not a problem. But what about long double? On an
> intel machine long double is Float96 while on a PowerPC it is
> Float128. Wouldn't it just be easier to specify LDouble or 'g' then
> special-case your code?
One problem to consider (and where I first ran into this kind of
thing) is when pickling. A pickle containing an array of Int isn't
portable, if the two machines have a different idea of what an Int is
(Int32 or Int64, for instance). Another reason to keep the byte-width.
LDouble, for instance, should probably be an alias to Float96 on
Intel, and Float128 on PPC, and pickle accordingly.
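The portability concern can be sketched with the standard struct module, used here as a stand-in for whatever the array pickling code would do. The variable names are made up for illustration. The point is only that a native "long" has a platform-dependent size, while an explicitly sized and byte-ordered format does not.

```python
import struct

values = [1, -2, 300000]

# Native C long: the size depends on the platform and compiler
# (commonly 4 or 8 bytes), so the raw bytes are not portable.
native_size = struct.calcsize("l")

# Standard-size, little-endian 32-bit int: 4 bytes on every platform.
fixed_size = struct.calcsize("<i")

# Serializing with an explicit width and byte order yields the same
# bytes everywhere, so another machine can read them back unambiguously.
fmt = "<%di" % len(values)
payload = struct.pack(fmt, *values)
restored = list(struct.unpack(fmt, payload))

print(native_size, fixed_size, restored)
```

An alias such as LDouble would then be resolved to its sized name before writing, so the stored data always records the width.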
> Problems also exist when you are interfacing with hardware or other C
> or Fortran code. You know you want single-precision floating point.
> You don't know or care what the bit-width is. I think with the
> Integer types the bit-width specification is more important than
> floating point types. In sum, I think it is important to have the
> ability to specify it both ways. When printing the array, it's
> probably better if it gives bit-width information. I like the way
> numarray prints arrays.
Do you mean adding bit-width info to str()? repr() definitely needs
it, and it should be included in all cases, I think.
You also run into the fact that sizeof(Python integer) isn't necessarily
sizeof(C int) (a Python int being a C long), especially on 64-bit systems.
I come from a C background, so things like Float64, etc., look wrong.
I think more in terms of single- and double-precision, so I'd suggest
adding some more descriptive types:
CInt (would be either Int32 or Int64, depending on the platform)
CFloat (can't do Float, for backwards-compatibility reasons)
CDouble (could just be Double)
CLong (or Long)
CLongLong (or LongLong)
That could make it easier to match types in Python code to types in C
Oh, and the Python types int and float should be allowed (especially
if you want this to go in the core!).
And a Fortran integer could be something else, but I think that's
more of a SciPy problem than Numeric or numarray. It could add
FInteger and FBoolean, for instance.
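As a rough sketch of how such C-named aliases could resolve to sized names on a given platform, the following uses the standard ctypes module to ask for the C sizes. The helper sized_name and the c_aliases dictionary are hypothetical, made up for this illustration.

```python
import ctypes

def sized_name(kind, ctype):
    """Build a sized type name such as 'Int32' from a C type's byte size."""
    return "%s%d" % (kind, 8 * ctypes.sizeof(ctype))

# Hypothetical alias table: C-style names mapped to bit-width names,
# computed for the platform this code runs on.
c_aliases = {
    "CInt": sized_name("Int", ctypes.c_int),
    "CLong": sized_name("Int", ctypes.c_long),
    "CLongLong": sized_name("Int", ctypes.c_longlong),
    "CFloat": sized_name("Float", ctypes.c_float),
    "CDouble": sized_name("Float", ctypes.c_double),
}

for alias in sorted(c_aliases):
    print(alias, "->", c_aliases[alias])
```

On common platforms CLong comes out as Int32 or Int64 depending on the system, which is exactly the case where the alias is convenient in source code but the sized name is what should be stored.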
> 2) Multidimensional array indexing.
> Sometimes it is useful to select out of an array some elements based
> on its linear (flattened) index in the array. MATLAB, for example,
> will allow you to take a three-dimensional array and index it with a
> single integer based on its Fortran-order: x(1,1,1), x(2,1,1), ...
> What I'm proposing would have X[K] essentially equivalent to
> X.flat[K]. The problem with always requiring the use of X.flat[K] is
> that X.flat does not work for discontiguous arrays. It could be made
> to work if X.flat returned some kind of specially-marked array, which
> would then have to be checked every time indexing occurred for any
> array. Or, there may be some way to have X.flat return an "indexable
> iterator" for X which may be a more Pythonic thing to do anyway. That
> could solve the problem and solve the discontiguous X.flat problem as
> If we can make X.flat[K] work for discontiguous arrays, then I would
> be very happy to not special-case the single index array but always
> treat it as a 1-tuple of integer index arrays.
Right now, I find X.flat to be pretty useless, as you need a
contiguous array. I'm +1 on making X.flat work in all cases (contiguous
and discontiguous). Either
a) X.flat returns a contiguous 1-dimensional array (like ravel(X)),
which may be a copy of X
b) X.flat returns a "flat-indexable" view of X
I'd argue for b), as I feel that attributes should operate as views,
not as potential copies. To me, attributes "feel like" they do no
work, so making a copy by mere dereferencing would be surprising.
If a), I'd rather flat() be a method (or have a ravel() method).
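The difference between a) and b) can be sketched in pure Python. The FlatView class below is hypothetical and works on a nested list rather than a real array; the rows are separate objects, so the storage is not one contiguous block. It only illustrates the view semantics, not how Numeric or numarray would implement it.

```python
class FlatView:
    """Hypothetical flat-indexable view over a 2-D nested list (row-major)."""

    def __init__(self, rows):
        self._rows = rows              # keep a reference, not a copy
        self._ncols = len(rows[0])

    def __len__(self):
        return len(self._rows) * self._ncols

    def _locate(self, k):
        # Translate a linear index into a (row, column) pair.
        if not 0 <= k < len(self):
            raise IndexError("flat index out of range")
        return divmod(k, self._ncols)

    def __getitem__(self, k):
        i, j = self._locate(k)
        return self._rows[i][j]

    def __setitem__(self, k, value):
        i, j = self._locate(k)
        self._rows[i][j] = value


x = [[0, 1, 2], [3, 4, 5]]

# Option b): a view. Writing through it changes x.
flat = FlatView(x)
flat[4] = 40

# Option a): a flattened copy. Writing to it leaves x alone.
copied = [v for row in x for v in row]
copied[0] = -1

print(x, copied)
```

With b), touching the attribute does no work until an element is actually indexed, which matches the expectation that attribute access is cheap.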
I think overloading X[K] starts to run into trouble: too many special
|David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/
|cookedm at physics.mcmaster.ca