[Numpy-discussion] Nasty bug using pre-initialized arrays

Robert Kern robert.kern@gmail....
Fri Jan 4 19:36:30 CST 2008


Zachary Pincus wrote:
> Hello all,
> 
>> That's well and good.  But NumPy should *never* automatically -- and
>> silently -- chop the imaginary part off your complex array elements,
>> particularly if you are just doing an innocent assignment!
>> Doing something drastic like silently throwing half your data away can
>> lead to all kinds of bugs in code written by somebody who is unaware
>> of this behavior (i.e. most people)!
>>
>> It sounds to me like the right thing is to throw an exception instead
>> of "downcasting" a data object.
> 
> I'm not sure that I agree! I'd rather not have to litter my code with  
> "casting" operations every time I wanted to down-convert data types  
> (creating yet more temporary arrays...) via assignment. e.g.:
> 
> A[i] = calculate(B).astype(A.dtype)
> vs.
> A[i] = calculate(B)
> 
> Further, writing functions to operate on generic user-provided output  
> arrays (or arrays of user-provided dtype; both of which are common  
> e.g. in scipy.ndimage) becomes more bug-prone, as every assignment  
> would need to be modified as above.
> 
> This change would break a lot of the image-processing code I've  
> written (where one often does computations in float or even double,  
> and then re-scales and down-converts the result to integer for  
> display), for example.
> 
> I guess that this could be rolled in via the geterr/seterr mechanism,  
> and could by default just print warnings. I agree that silent  
> truncation can be troublesome, but not having to spell every  
> conversion out in explicit (and error-prone) detail is also pretty  
> useful. (And arguably more pythonic.)

There's some confusion in the conversation here. Tim already identified it, but
since it's continuing, I'm going to reiterate.

There are two related hierarchies of datatypes: different kinds (integer,
floating point, complex floating point) and different precisions within a given
kind (int8, int16, int32, int64). The term "downcasting" should probably be
reserved for the latter only.
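
To make the two hierarchies concrete, here is a minimal sketch using the `kind`
and `itemsize` attributes of a NumPy dtype. The helper names `describe` and
`is_precision_change_only` are hypothetical, made up for this illustration; they
are not part of NumPy.

```python
import numpy as np

def describe(name):
    """Return (kind code, bytes per element) for a dtype name."""
    dt = np.dtype(name)
    return dt.kind, dt.itemsize

def is_precision_change_only(src, dst):
    """Hypothetical helper: True when both dtypes belong to the same kind."""
    return np.dtype(src).kind == np.dtype(dst).kind

# Same kind, different precision: only itemsize differs.
for name in ("int8", "int64", "float32", "float64", "complex128"):
    print(name, describe(name))

# int64 -> int8 stays within the integer kind (a "downcast" in the sense above).
print(is_precision_change_only("int64", "int8"))
# complex128 -> float64 crosses kinds, so it is a different sort of conversion.
print(is_precision_change_only("complex128", "float64"))
```

Note that NumPy reports signed and unsigned integers as separate kinds (`'i'`
and `'u'`), so this check is stricter than the three-way split described above.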

It seems to me that Zach and Scott are defending downcasting of precisions
within a given kind. It does not necessarily follow that the behavior we choose
for dtypes within a given kind should be the behavior when we are dealing with
dtypes across different kinds. We can keep the precision downcasting behavior
that you want while raising an error when one attempts to assign a complex
number into a floating point array.
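
A rough sketch of what such a policy could look like from user code, assuming a
hypothetical wrapper named `assign_checked` (this is not an existing NumPy
function, and it is not a proposal for how NumPy itself would implement the
check). It lets precision downcasts through unchanged and refuses only the
complex-to-non-complex case:

```python
import numpy as np

def assign_checked(dst, src):
    """Hypothetical helper: copy src into dst, refusing to drop imaginary parts.

    Conversions that do not involve complex input (float64 -> float32,
    float -> int, int64 -> int8, ...) are left to ordinary assignment.
    """
    src = np.asarray(src)
    if src.dtype.kind == "c" and dst.dtype.kind != "c":
        raise TypeError(
            f"refusing to assign {src.dtype} data into a {dst.dtype} array"
        )
    dst[...] = src  # ordinary assignment, casts to dst.dtype as usual
    return dst

# Precision downcast within the floating point kind: allowed.
a = np.zeros(3, dtype=np.float32)
assign_checked(a, np.array([1.5, 2.5, 3.5], dtype=np.float64))

# Float result stored into an integer array for display: still allowed.
img = np.zeros(3, dtype=np.int32)
assign_checked(img, np.array([10.0, 20.0, 30.0]))

# Complex into float: raises instead of discarding the imaginary part.
try:
    assign_checked(np.zeros(3), np.array([1 + 2j, 0, 0]))
except TypeError as exc:
    print("rejected:", exc)
```

The design choice here is deliberately narrow: only the cross-kind case that
loses the imaginary part is rejected, so code that relies on assignment to
narrow precision keeps working without explicit `astype` calls.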

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco

