[Numpy-discussion] Do we want scalar casting to behave as it does at the moment?
Mon Nov 12 15:34:59 CST 2012
On Mon, Nov 12, 2012 at 10:27 PM, Matthew Brett <firstname.lastname@example.org> wrote:
> On Mon, Nov 12, 2012 at 1:11 PM, Nathaniel Smith <email@example.com> wrote:
>> On Mon, Nov 12, 2012 at 8:54 PM, Matthew Brett <firstname.lastname@example.org> wrote:
>>> I wanted to check that everyone knows about and is happy with the
>>> scalar casting changes from 1.6.0.
>>> Specifically, the rules for (array, scalar) casting have changed such
>>> that the resulting dtype depends on the _value_ of the scalar.
>>> Mark W has documented these changes here:
>>> Specifically, as of 1.6.0:
>>> In : arr = np.array([1.], dtype=np.float32)
>>> In : (arr + (2**16-1)).dtype
>>> Out: dtype('float32')
>>> In : (arr + (2**16)).dtype
>>> Out: dtype('float64')
>>> In : arr = np.array([1.], dtype=np.int8)
>>> In : (arr + 127).dtype
>>> Out: dtype('int8')
>>> In : (arr + 128).dtype
>>> Out: dtype('int16')
>>> There's discussion about the changes here:
>>> It seems to me that this change is hard to explain, and does what you
>>> want only some of the time, making it a false friend.
>> The old behaviour was that in these cases, the scalar was always cast
>> to the type of the array, right? So
>> np.array([1], dtype=np.int8) + 256
>> returned 1? Is that the behaviour you prefer?
> Right. In that case of course, I'm getting something a bit nasty.
> But if you're working with int8, I think you expect to be careful of
> overflow. And you may well not want an automatic and maybe surprising
> upcast to int16.
>> I agree that the 1.6 behaviour is surprising and somewhat
>> inconsistent. There are many places where you can get an overflow in
>> numpy, and in all the other cases we just let the overflow happen. And
>> in fact you can still get an overflow with arr + scalar operations, so
>> this doesn't really fix anything.
> Right - it's a half-fix, which seems to me worse than no fix.
>> I find the specific handling of unsigned -> signed and float32 ->
>> float64 upcasting confusing as well. (Sure, 2**16 isn't exactly
>> representable as a float32, but it doesn't *overflow*, it just gives
>> you 2.0**16... if I'm using float32 then I presumably don't care that
>> much about exact representability, so it's surprising that numpy is
>> working to enforce it, and definitely a separate decision from what to
>> do about overflow.)
>> None of those threads seem to really get into the question of what the
>> best behaviour here *is*, though.
>> Possibly the most defensible choice is to treat ufunc(arr, scalar)
>> operations as performing an implicit cast of the scalar to arr's
>> dtype, and using the standard implicit casting rules -- which I think
>> means, raising an error if !can_cast(scalar, arr.dtype,
> You mean:
> In : arr = np.array([1.], dtype=np.int8)
> In : arr + 128
> ValueError - cannot safely cast 128 to array dtype int8?
> That would be a major change. If I really wanted to do that, would
> you then suggest I cast to an array?
> arr + np.array([128])
No, that will upcast, I think. (It should, anyway -- scalars are a
special case.) Maybe you meant np.array([128], dtype=np.int8)?
Anyway, you'd cast to an int8 scalar:
arr + np.int8(128)
(or a scalar array, np.array(128, dtype=np.int8), I think would also
count as a scalar for these purposes?)
I don't see how this would be *that* major a change, the change in 1.6
(which silently changed the meaning of people's code) is larger, I
would say :-).
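A minimal sketch of the rule proposed above, for illustration only: the
scalar is checked against the array's dtype and an error is raised instead
of upcasting. The helper name add_scalar_strict is hypothetical, not a NumPy
function, and the sketch assumes an integer array and a Python int scalar.
It checks the range with np.iinfo rather than np.can_cast, because how
np.can_cast treats Python scalars differs between NumPy versions.

```python
import numpy as np


def add_scalar_strict(arr, scalar):
    """Hypothetical helper: add a Python int to an integer array,
    refusing to change arr.dtype instead of upcasting."""
    info = np.iinfo(arr.dtype)  # representable range of the array's dtype
    if not (info.min <= scalar <= info.max):
        raise ValueError(
            f"cannot safely cast {scalar} to array dtype {arr.dtype}"
        )
    # Cast the scalar explicitly so that both operands share one dtype.
    return arr + arr.dtype.type(scalar)


arr = np.array([1], dtype=np.int8)
result = add_scalar_strict(arr, 100)  # fits in int8, dtype stays int8
```

With this sketch, add_scalar_strict(arr, 128) raises ValueError rather than
returning an int16 array. Overflow in the addition itself (for example
100 + 100 in int8) is deliberately left alone, as in other NumPy operations.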
> It would be very good to make a well-argued long-term decision,
> whatever the chosen outcome. Maybe this is the place for a partly
> retrospective NEP?