[Numpy-discussion] Do we want scalar casting to behave as it does at the moment?

Matthew Brett matthew.brett@gmail....
Mon Jan 21 16:46:55 CST 2013


Hi,

On Sun, Jan 20, 2013 at 6:10 PM, Olivier Delalleau <shish@keba.be> wrote:
> 2013/1/18 Matthew Brett <matthew.brett@gmail.com>:
>> Hi,
>>
>> On Fri, Jan 18, 2013 at 7:58 PM, Chris Barker - NOAA Federal
>> <chris.barker@noaa.gov> wrote:
>>> On Fri, Jan 18, 2013 at 4:39 AM, Olivier Delalleau <shish@keba.be> wrote:
>>>> On Friday, 18 January 2013, Chris Barker - NOAA Federal wrote:
>>>
>>>> If you check again the examples in this thread exhibiting surprising /
>>>> unexpected behavior, you'll notice most of them are with integers.
>>>> The tricky thing about integers is that downcasting can dramatically change
>>>> your result. With floats, not so much: you get approximation errors (usually
>>>> what you want) and the occasional nan / inf creeping in (usually noticeable).
>>>
>>> fair enough.
>>>
>>> However my core argument is that people use non-standard (usually
>>> smaller) dtypes for a reason, and it should be hard to accidentally
>>> up-cast.
>>>
>>> This is in contrast with the argument that accidental down-casting can
>>> produce incorrect results, and thus it should be hard to accidentally
>>> down-cast -- same argument whether the incorrect results are drastic
>>> or not....
>>>
>>> It's really a question of which of these we think should be prioritized.
>>
>> After thinking about it for a while, it seems to me Olivier's
>> suggestion is a good one.
>>
>> The rule becomes the following:
>>
>> array + scalar casting is the same as array + array casting except
>> array + scalar casting does not upcast floating point precision of the
>> array.
>>
>> Am I right (Chris, Perry?) that this deals with almost all your cases?
>>  Meaning that it is upcasting of floats that is the main problem, not
>> upcasting of (u)ints?
>>
>> This rule seems to me not very far from the current 1.6 behavior; it
>> upcasts more - but the dtype is now predictable.  It's easy to
>> explain.  It avoids the obvious errors that the 1.6 rules were trying
>> to avoid.  It doesn't seem too far to stretch to make a distinction
>> between rules about range (ints) and rules about precision (float,
>> complex).
>>
>> What do you'all think?
>
> Personally, I think the main issue with my suggestion is that it seems
> hard to get there from the current behavior without potentially
> breaking existing code in non-obvious ways. The main problematic case
> I foresee is the typical "small_int_array + 1", which would get
> upcast, which wasn't the case before (neither in 1.5 nor in 1.6).
> That's why I think Nathaniel's proposal is more practical.
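
To make the discussion above concrete, here is a small sketch
(untested as I write this; the dtypes in the comments are what I
believe the various rules give, so please correct me if I have any of
them wrong):

    import numpy as np

    a = np.ones(3, dtype=np.int8)

    # Current 1.5 / 1.6 behavior: the scalar does not upcast, because
    # the value 1 fits comfortably in int8.
    a + 1                            # int8 today

    # The equivalent array + array operation upcasts, because the
    # default integer array is (typically) int64.
    a + np.array([1, 1, 1])          # int64

    # Under Olivier's rule, "a + 1" would follow the array + array
    # result above and become int64 - that is the
    # backwards-compatibility worry about "small_int_array + 1".

    f = np.ones(3, dtype=np.float32)

    # Floating point precision is the exception in the proposed rule:
    # the scalar would still not upcast the array, matching 1.6.
    f + 1.0                          # float32
    f + np.array([1.0, 1.0, 1.0])    # float64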

It's important to establish the behavior we want in the long term,
because it will likely affect the stop-gap solution we choose now.

For example, let's say we think that the 1.5 behavior is desired in
the long term - in that case Nathaniel's solution seems good (although
it will change behavior from 1.6.x).
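
The behavior change in question is, if I remember the 1.5 rules
correctly, the case where the scalar does not fit in the array's
dtype (again a rough sketch, not run against 1.5):

    import numpy as np

    a = np.ones(3, dtype=np.int8)

    # 1.5: the result stays int8 and the value wraps around.
    # 1.6: the result is upcast to int16, because 128 does not fit
    #      in int8.
    a + 128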

If we think that your suggestion is preferable for the long term,
then sticking with the 1.6 behavior is more attractive.

It seems to me we need the use-cases laid out properly in order to
decide; at the moment we are working somewhat blind, at least in my
opinion.

Cheers,

Matthew

