[Numpy-discussion] PEP 209: Multi-dimensional Arrays

Rob W. W. Hooft rob at hooft.net
Wed Feb 14 01:42:36 CST 2001


Some random PEP talk.

>>>>> "PB" == Paul Barrett <Barrett at stsci.edu> writes:

 PB> 2.  Additional array types
    
 PB> Numeric 1 has 11 defined types: char, ubyte, sbyte, short, int,
 PB> long, float, double, cfloat, cdouble, and object.  There are no
 PB> ushort, uint, or ulong types, nor are there more complex types
 PB> such as a bit type which is of use to some fields of science and
 PB> possibly for implementing masked-arrays.

True: I would have had a much easier life with a ushort type. 
    
 PB> Its relation to the other types is defined when the C-extension
 PB> module for that type is imported.  The corresponding Python code
 PB> is:
    
     >> Int32.astype[Real64] = Real64

I understand this is to be done by the Int32 C extension module. 
But how does it know about Real64?
    
 PB> Attributes:
 PB> .name:                  e.g. "Int32", "Float64", etc.
 PB> .typecode:              e.g. 'i', 'f', etc.
 PB> (for backward compatibility)

.typecode() is a method now.

 PB> .size (in bytes):       e.g. 4, 8, etc.

Should this be called "element size"?

 >> add.register('add', (Int32, Int32, Int32), cfunc-add)

Typo: cfunc-add is an expression, not an identifier.
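
A quick way to see the problem, sketched with the ast module of present-day Python (my choice of tool, not something from the PEP). The spelling cfunc_add is only a guess at what was meant:

```python
import ast

# Parse the last argument of the PEP example as a Python expression.
node = ast.parse("cfunc-add", mode="eval").body

# The parser sees a subtraction of two names, not one identifier.
is_subtraction = isinstance(node, ast.BinOp) and isinstance(node.op, ast.Sub)
operand_names = (node.left.id, node.right.id)

# With an underscore (hypothetical spelling) it is a single name.
is_identifier = isinstance(ast.parse("cfunc_add", mode="eval").body, ast.Name)
```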

An implementation of a (Int32, Float32, Float32) add is possible and
desirable as mentioned earlier in the document. Which C module is
going to declare such a combination?
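
One possible answer, as a minimal sketch only: let a separate "glue" module register the mixed combinations. The UFunc class, its register/lookup signatures and the string type names below are all hypothetical and simplified; they are not the PEP's API.

```python
class UFunc:
    """Hypothetical ufunc holding implementations keyed by input types."""

    def __init__(self, name):
        self.name = name
        self._impls = {}  # (in1, in2) -> (out, func)

    def register(self, in1, in2, out, func):
        key = (in1, in2)
        if key in self._impls:
            raise ValueError(f"{self.name}{key} is already registered")
        self._impls[key] = (out, func)

    def lookup(self, in1, in2):
        try:
            return self._impls[(in1, in2)]
        except KeyError:
            raise TypeError(f"no {self.name} loop for ({in1}, {in2})") from None


add = UFunc("add")

# What the Int32 extension module could register for itself.
add.register("Int32", "Int32", "Int32", lambda x, y: x + y)

# The mixed case, registered by a third (hypothetical) glue module, so that
# neither the Int32 nor the Float32 module has to know about the other.
add.register("Int32", "Float32", "Float32", lambda x, y: float(x) + y)

out_type, func = add.lookup("Int32", "Float32")
result = func(2, 0.5)
```

The open question remains who owns that third module and when it gets imported.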

 PB> asstring():             create string from array

Not "tostring" like now?
        
 PB> 4.  ArrayView

 PB> This class is similar to the Array class except that the reshape
 PB> and flat methods will raise exceptions, since non-contiguous
 PB> arrays cannot be reshaped or flattened using just pointer and
 PB> step-size information.

This was completely unclear to me until here. I must say I find this a
strange way of handling things. I haven't looked into implementation
details, but wouldn't it feel more natural if an Array were just the
"data", and an ArrayView contained the dimensions and strides? The two
would be completely separated. One would always need a pair, but more
than one ArrayView could use the same Array.
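
To make the idea concrete, a rough sketch in plain Python. The class layout, the offset parameter and the element-counted strides are my own assumptions for illustration, not anything from the PEP:

```python
class Array:
    """Owns the flat data buffer and nothing else (hypothetical)."""

    def __init__(self, values):
        self.data = list(values)


class ArrayView:
    """Holds shape, strides and offset into a shared Array (hypothetical)."""

    def __init__(self, array, shape, strides, offset=0):
        self.array = array
        self.shape = shape
        self.strides = strides  # counted in elements, not bytes
        self.offset = offset

    def _flat_index(self, idx):
        if len(idx) != len(self.shape):
            raise IndexError("wrong number of indices")
        pos = self.offset
        for i, n, s in zip(idx, self.shape, self.strides):
            if not 0 <= i < n:
                raise IndexError("index out of range")
            pos += i * s
        return pos

    def __getitem__(self, idx):
        return self.array.data[self._flat_index(idx)]

    def __setitem__(self, idx, value):
        self.array.data[self._flat_index(idx)] = value


data = Array(range(6))
rows = ArrayView(data, shape=(2, 3), strides=(3, 1))
transposed = ArrayView(data, shape=(3, 2), strides=(1, 3))

rows[0, 1] = 10  # writes flat element 1 of the shared buffer
```

Both views see the write, because neither of them owns any data.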

 PB> a.  _ufunc:

 PB> 1.  Does slicing syntax default to copy or view behavior?

Numeric 1 uses slicing for views, and a method for copies. "Feeling"
compatible with core Python would require a copy on the rhs, and a view
on the lhs of an assignment. Is that distinction possible?

If copy is the default for slicing, how would one make a view?
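
For reference, this is the core Python behaviour I mean, shown with an ordinary list (the variable names are only for illustration):

```python
a = [0, 1, 2, 3, 4]

# Slice on the rhs: b is a new list, a copy of part of a.
b = a[1:3]
b[0] = 99
a_after_copy = list(a)  # snapshot: a has not been touched

# Slice on the lhs: the assignment changes a in place.
a[1:3] = [7, 8]
```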

 PB> 2.  Does item syntax default to copy or view behavior?

View.

 PB> Yet, c[i] can be considered just a shorthand for c[i,:] which
 PB> would imply copy behavior assuming slicing syntax returns a copy.

If you reason that way, then c is just a shorthand for c[...] too.

 PB> 3.  How is scalar coercion implemented?

 PB> Python has fewer numeric types than Numeric which can cause
 PB> coercion problems.  For example when multiplying a Python scalar
 PB> of type float and a Numeric array of type float, the Numeric array
 PB> is converted to a double, since the Python float type is actually
 PB> a double.  This is often not the desired behavior, since the
 PB> Numeric array will be doubled in size which is likely to be
 PB> annoying, particularly for very large arrays.

Sure. That is handled reasonably well by the current Numeric 1.

To extend this, I'd like to comment that I have never really understood
the philosophy, common to all languages, of taking the largest type for
coercion. Being a scientist, I have learned that when you multiply a very
accurate number by a very approximate number, your result is going to be
very approximate, not very accurate! It would thus be more logical to have
Float32*Float64 return a Float32!
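
A small sketch of the accuracy argument, using the struct module to imitate a Float32 in plain Python (the helper to_float32 is hypothetical):

```python
import struct


def to_float32(x):
    """Round a Python float (a C double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


exact = 1.0 / 3.0           # double precision
approx = to_float32(exact)  # the same number after a trip through Float32

# The product is computed and stored as a double ...
product = approx * 3.0

# ... but it is only as accurate as its least accurate factor.
relative_error = abs(product - 1.0)
```

The error sits at the single-precision level, so the extra digits of the double result carry no information.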

 PB> In a future version of Python, the behavior of integer division
 PB> will change.  The operands will be converted to floats, so the
 PB> result will be a float.  If we implement the proposed scalar
 PB> coercion rules where arrays have precedence over Python scalars,
 PB> then dividing an array by an integer will return an integer array
 PB> and will not be consistent with a future version of Python which
 PB> would return an array of type double.  Scientific programmers are
 PB> familiar with the distinction between integer and float-point
 PB> division, so should Numeric 2 continue with this behavior?

Numeric 2 should be as compatible as reasonably possible with core Python.
But my question is: how would we do integer division of arrays? With a
ufunc for which no operator shortcut exists?
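
Such a named function could look roughly like this. It is a sketch on plain lists; the name floor_divide is hypothetical, and the body relies on the // operator that core Python gained only later:

```python
def floor_divide(a, b):
    """Hypothetical elementwise integer division; b is a scalar or a sequence."""
    if isinstance(b, int):
        b = [b] * len(a)
    if len(a) != len(b):
        raise ValueError("operands must have the same length")
    # Floor division rounds towards minus infinity, also for negative values.
    return [x // y for x, y in zip(a, b)]


quotients = floor_divide([7, 8, -7], 2)
```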

 PB> 7.  How are numerical errors handled (IEEE floating-point errors in
 PB> particular)?

I am developing my code on Linux and IRIX. I have seen that where
Numeric code on Linux runs fine, the same code on IRIX may "core dump"
on an FPE (floating-point exception), e.g. for arctan2(0,0). That
difference should be avoided.
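
One way to get the same behaviour everywhere is to decide the special case in the wrapper itself instead of leaving it to the platform's math library. A minimal sketch, in which the function name and the choice to raise are only assumptions (returning a fixed value would be an equally valid policy):

```python
import math


def arctan2_checked(y, x):
    """Hypothetical wrapper: make arctan2(0, 0) an explicit, catchable error."""
    if x == 0 and y == 0:
        raise ArithmeticError("arctan2(0, 0) is undefined")
    return math.atan2(y, x)


try:
    arctan2_checked(0.0, 0.0)
    outcome = "value"
except ArithmeticError:
    outcome = "exception"

quarter_turn = arctan2_checked(1.0, 0.0)
```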

 PB> a.  Print a message of the most severe error, leaving it to
 PB> the user to locate the errors.

What is the most severe error?

 PB> c.  Minimall UFunc class:

Typo: Minimal?

Regards,

Rob Hooft
-- 
=====   rob at hooft.net          http://www.hooft.net/people/rob/  =====
=====   R&D, Nonius BV, Delft  http://www.nonius.nl/             =====
===== PGPid 0xFA19277D ========================== Use Linux! =========



