[Numpy-discussion] Thoughts about zero dimensional arrays vs Python scalars

Colin J. Williams cjw at sympatico.ca
Sun Mar 20 10:36:20 CST 2005


Colin J. Williams wrote:

> Ralf Juengling wrote:
>
>> Travis,
>>
>> Discussing zero dimensional arrays, the PEP says at one point:
>>
>>   ... When ndarray is imported, it will alter the numeric table
>>   for python int, float, and complex to behave the same as
>>   array objects.
>>   Thus, in the proposed solution, 0-dim arrays would never be
>>   returned from calculation, but instead, the equivalent Python
>>   Array Scalar Type.  Internally, these ArrayScalars can
>>   be quickly converted to 0-dim arrays when needed.  Each scalar
>>   would also have a method to convert to a "standard" Python Type
>>   upon request (though this shouldn't be needed often).
>>
>>
>> I'm not sure I understand this. Does it mean that, after having
>> imported ndarray, "type(1)" evaluates to "ndarray.IntArrType" rather
>> than "int"?
>>
>> If so, I think this is a dangerous idea. There is one important
>> difference between zero dimensional arrays and Python scalar types, 
>> which is not discussed in the PEP: arrays are mutable, Python scalars 
>> are immutable.
>>
>> When Guido introduced in-place operators in Python (+=, *=, etc.),
>> he decided that "i += 1" should be allowed for Python scalars and
>> should mean "i = i + 1". Here you have it: "i += 1" means something
>> different when i is a mutable zero dimensional array. So I suspect
>> a tacit re-definition of Python scalars on ndarray import will
>> break some code out there (code that does not deal with arrays
>> at all).
>> Facing this important difference between arrays and Python
>> scalars, I'm also not sure anymore that advertising zero
>> dimensional arrays as essentially the same as Python scalars
>> is such a good idea. Perhaps it would be better not to try to
>> inherit from Python's number types and all that. Perhaps it
>> would be easier to just say that indexing an array always results in 
>> an array and that zero dimensional arrays can be converted into 
>> Python scalars. Period.
>>
>> Ralf
>>
>>
>> PS: You wrote two questions about zero dimensional arrays vs Python 
>> scalars into the PEP. What are your plans for deciding these?
>>
>>
>>  
>>
> It looks as though a decision has been made.  I was among those who 
> favoured abandoning rank-0 arrays; we lost.
>
> To my mind rank-0 arrays add complexity for little benefit and make 
> explanation more difficult.
>
> I don't spot any discussion in the PEP of the pros and cons of the nd 
> == 0 case.

A correction!
There is, in the PEP::

      Questions

      1) should sequence behavior (i.e. some combination of slicing,
      indexing, and len) be supported for 0-dim arrays?

         Pros:  It means that len(a) always works and returns the size
                of the array.  Slicing code and indexing code 
                will work for any dimension (the 0-dim array is an
                identity element for the operation of slicing)
                
         Cons:  0-dim arrays are really scalars.  They should behave
                like Python scalars which do not allow sequence behavior

      2) should array operations that result in a 0-dim array that
         is the same basic type as one of the Python scalars, return the
         Python scalar instead?

         Pros: 

               1) Some cases when Python expects an integer (the most
               dramatic is when slicing and indexing a sequence:
               _PyEval_SliceIndex in ceval.c) it will not try to
               convert it to an integer first before raising an error.
               Therefore it is convenient to have 0-dim arrays that
               are integers converted for you by the array object.

               2) No risk of user confusion by having two types that
               are nearly but not exactly the same and whose separate
               existence can only be explained by the history of
               Python and NumPy development.

               3) No problems with code that does explicit typechecks
               (isinstance(x, float) or type(x) ==
               types.FloatType). Although explicit typechecks are
               considered bad practice in general, there are a couple
               of valid reasons to use them.

               4) No creation of a dependency on Numeric in pickle
               files (though this could also be done by a special case
               in the pickling code for arrays)
         
         Cons:  It is difficult to write generic code because scalars
                do not have the same methods and attributes as arrays
                (such as .type or .shape).  Python scalars also have
                different numeric behavior.

                This results in special-case checking that is not
                pleasant.  Fundamentally it lets the user believe that
                somehow multidimensional homogeneous arrays
                are something like Python lists (which, except for
                Object arrays, they are not).
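
To make Pros (1) and (3) concrete, here is a sketch using modern NumPy
as a stand-in for the proposed array scalar types (an assumption on my
part; the PEP's type names differ from today's):

```python
# Sketch of Pros (1) and (3), assuming the proposed array scalars
# behave like today's NumPy scalar types.
import numpy as np

x = np.float64(1.5)          # an array scalar, as returned by computations
assert isinstance(x, float)  # Pros (3): float64 subclasses Python float,
                             # so explicit typechecks keep working

i = np.int64(2)              # an integer array scalar
seq = [10, 20, 30, 40]
assert seq[i] == 30          # Pros (1): usable directly as a sequence index
```
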

For me and for the end user, the Pros of (2) win.
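
And the in-place operator difference Ralf describes can be sketched
like this, again using modern NumPy as a stand-in for the proposed
ndarray (an assumption; Numeric3/ndarray as proposed may differ):

```python
# The mutability difference between Python scalars and 0-dim arrays.
import numpy as np

i = 1           # Python int: immutable
j = i
i += 1          # rebinds i to a new object ("i = i + 1")
assert j == 1   # j is unaffected

a = np.array(1)  # 0-dim array: mutable
b = a
a += 1           # mutates the array in place
assert b == 2    # b refers to the same object and sees the change
```
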

Colin W.





