[Numpy-discussion] Do we want scalar casting to behave as it does at the moment?
Matthew Brett
matthew.brett@gmail....
Mon Nov 12 15:27:27 CST 2012
Hi,
On Mon, Nov 12, 2012 at 1:11 PM, Nathaniel Smith <njs@pobox.com> wrote:
> On Mon, Nov 12, 2012 at 8:54 PM, Matthew Brett <matthew.brett@gmail.com> wrote:
>> Hi,
>>
>> I wanted to check that everyone knows about and is happy with the
>> scalar casting changes from 1.6.0.
>>
>> Specifically, the rules for (array, scalar) casting have changed such
>> that the resulting dtype depends on the _value_ of the scalar.
>>
>> Mark W has documented these changes here:
>>
>> http://docs.scipy.org/doc/numpy/reference/ufuncs.html#casting-rules
>> http://docs.scipy.org/doc/numpy/reference/generated/numpy.result_type.html
>> http://docs.scipy.org/doc/numpy/reference/generated/numpy.promote_types.html
>>
>> Specifically, as of 1.6.0:
>>
>> In [19]: arr = np.array([1.], dtype=np.float32)
>>
>> In [20]: (arr + (2**16-1)).dtype
>> Out[20]: dtype('float32')
>>
>> In [21]: (arr + (2**16)).dtype
>> Out[21]: dtype('float64')
>>
>> In [25]: arr = np.array([1.], dtype=np.int8)
>>
>> In [26]: (arr + 127).dtype
>> Out[26]: dtype('int8')
>>
>> In [27]: (arr + 128).dtype
>> Out[27]: dtype('int16')
>>
>> There's discussion about the changes here:
>>
>> http://mail.scipy.org/pipermail/numpy-discussion/2011-September/058563.html
>> http://mail.scipy.org/pipermail/numpy-discussion/2011-March/055156.html
>> http://mail.scipy.org/pipermail/numpy-discussion/2012-February/060381.html
>>
>> It seems to me that this change is hard to explain, and does what you
>> want only some of the time, making it a false friend.
>
> The old behaviour was that in these cases, the scalar was always cast
> to the type of the array, right? So
> np.array([1], dtype=np.int8) + 256
> returned 1? Is that the behaviour you prefer?
Right. In that case, of course, I get something a bit nasty: the 256
wraps around to 0 in int8. But if you're working with int8, I think you
expect to have to be careful about overflow, and you may well not want
an automatic and maybe surprising upcast to int16.
> I agree that the 1.6 behaviour is surprising and somewhat
> inconsistent. There are many places where you can get an overflow in
> numpy, and in all the other cases we just let the overflow happen. And
> in fact you can still get an overflow with arr + scalar operations, so
> this doesn't really fix anything.
Right - it's a half-fix, which seems to me worse than no fix.
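A small sketch of why it is only a half-fix: the value-based rule looks at the scalar, not at the result, so a scalar that fits the array's dtype can still push the sum out of range. The values here are chosen purely for illustration.

```python
import numpy as np

arr = np.array([127], dtype=np.int8)

# The scalar 1 fits comfortably in int8, so no upcast is triggered,
# yet 127 + 1 does not fit in int8 and the sum wraps around.
result = arr + 1
print(result, result.dtype)
```

The result is still int8 and the addition has silently overflowed, exactly as it would for any other array operation.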
> I find the specific handling of unsigned -> signed and float32 ->
> float64 upcasting confusing as well. (Sure, 2**16 isn't exactly
> representable as a float32, but it doesn't *overflow*, it just gives
> you 2.0**16... if I'm using float32 then I presumably don't care that
> much about exact representability, so it's surprising that numpy is
> working to enforce it, and definitely a separate decision from what to
> do about overflow.)
>
> None of those threads seem to really get into the question of what the
> best behaviour here *is*, though.
>
> Possibly the most defensible choice is to treat ufunc(arr, scalar)
> operations as performing an implicit cast of the scalar to arr's
> dtype, and using the standard implicit casting rules -- which I think
> means, raising an error if !can_cast(scalar, arr.dtype,
> casting="safe")
You mean:

In [25]: arr = np.array([1.], dtype=np.int8)

In [27]: arr + 128
ValueError - cannot safely cast 128 to array dtype int8?
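As a rough sketch of what such a rule might look like from the user's side, the hypothetical helper below (add_scalar_strict is a made-up name, not a NumPy function) refuses any integer scalar that does not fit the array's dtype. It uses a plain range check against np.iinfo as a stand-in for a "safe" cast test and only handles integer dtypes.

```python
import numpy as np

def add_scalar_strict(arr, scalar):
    """Hypothetical helper: add an integer scalar, refusing values
    that cannot be represented in arr.dtype."""
    info = np.iinfo(arr.dtype)  # this sketch covers integer dtypes only
    if not (info.min <= scalar <= info.max):
        raise ValueError(
            "cannot safely cast %r to array dtype %s" % (scalar, arr.dtype)
        )
    # The value fits, so converting it to the array's dtype is lossless.
    return arr + arr.dtype.type(scalar)

arr = np.array([1], dtype=np.int8)
ok = add_scalar_strict(arr, 100)  # 100 fits in int8

try:
    add_scalar_strict(arr, 128)   # 128 does not fit in int8
    raised = False
except ValueError:
    raised = True

print(ok, ok.dtype, raised)
```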
That would be a major change. If I really did want to add 128, would
you then suggest that I wrap the scalar in an array first?
arr + np.array([128])
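For reference, a short sketch of what the array workaround does to the result dtype. Once the scalar is wrapped in an array, ordinary array-array promotion applies, so the outcome depends on the wrapper's dtype rather than on the value 128. The explicit int16 is my choice for illustration.

```python
import numpy as np

arr = np.array([1], dtype=np.int8)

# With an explicit dtype the promotion is predictable:
# int8 combined with int16 gives int16.
wide = arr + np.array([128], dtype=np.int16)

# Without a dtype, np.array([128]) uses the platform default integer,
# so the result takes that wider type instead.
default = arr + np.array([128])

print(wide, wide.dtype, default.dtype)
```

So the bare np.array([128]) form upcasts much further than int16 unless the dtype is spelled out.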
It would be very good to make a well-argued long-term decision,
whatever the chosen outcome. Maybe this is the place for a partly
retrospective NEP?
Best,
Matthew