[Numpy-discussion] Do we want scalar casting to behave as it does at the moment?

Olivier Delalleau shish@keba...
Thu Jan 17 19:34:13 CST 2013

2013/1/17 Matthew Brett <matthew.brett@gmail.com>:
> Hi,
> On Fri, Jan 18, 2013 at 1:04 AM, Chris Barker - NOAA Federal
> <chris.barker@noaa.gov> wrote:
>> On Thu, Jan 17, 2013 at 6:26 AM, Matthew Brett <matthew.brett@gmail.com> wrote:
>>> I am starting to wonder if we should aim for making
>>> * scalar and array casting rules the same;
>>> * Python int / float scalars become int32 / 64 or float64;
>> aren't they already? I'm not sure what you are proposing.
> Sorry - yes that is what they are already, this sentence refers back
> to an earlier suggestion of mine on the thread, which I am discarding.
>>> This has the benefit of being very easy to understand and explain.  It
>>> makes dtypes predictable in the sense they don't depend on value.
>> That is key -- I don't think casting should ever depend on value.
>>> Those wanting to maintain - say - float32 will need to cast scalars to float32.
>>> Maybe the use-cases motivating the scalar casting rules - maintaining
>>> float32 precision in particular - can be dealt with by careful casting
>>> of scalars, throwing the burden onto the memory-conscious to maintain
>>> their dtypes.
>> IIRC this is how it worked "back in the day" (the Numeric day?) -- and
>> I'm pretty sure that in the long run it worked out badly. The core
>> problem is that there are only Python literals for a couple of types, and
>> it was oh so easy to do things like:
>> my_array = np.zeros(shape, dtype=np.float32)
>> another_array = my_array * 4.0
>> and you'd suddenly get a float64 array. (Of course, we already know
>> all that...) I suppose this has the upside of being safe, and having
>> scalar and array casting rules be the same is of course appealing, but
>> you use a particular size dtype for a reason, and it's a real pain to
>> maintain it.
> Yes, I do understand that.  The difference - as I understand it - is
> that back in the day, Numeric did not have the float32 etc.
> scalars, so you could not do:
> another_array = my_array * np.float32(4.0)
> (please someone correct me if I'm wrong).
>> Casual users will use the defaults that match the Python types anyway.
> I think what we are reading in this thread is that even experienced
> numpy users can find the scalar casting rules surprising, and that's a
> real problem, it seems to me.
> The person with a massive float32 array certainly should have the
> ability to control upcasting, but I think the default should be the
> least surprising thing, and that, it seems to me, is for the casting
> rules to be the same for arrays and scalars.   In the very long term.

That would also be my preference, after banging my head against this
problem for a while now, because it's simple and consistent.

Since most of the related issues seem to come from integer arrays, a
middle ground may be the following:
- Integer-type arrays get upcast by scalars as in usual array /
array operations.
- Float/Complex-type arrays don't get upcast by scalars except when
the scalar is complex and the array is float.

It makes the rule a bit more complex, but has the advantage of better
preserving float types while getting rid of most issues related to
integer overflows.
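
To make the two cases concrete, here is a minimal sketch of the rule as a
function that returns the result dtype of "array op scalar". The helper name
proposed_result_dtype is hypothetical and not part of NumPy. It assumes that a
Python scalar is first given NumPy's default dtype for its type, and that a
float array meeting a complex scalar becomes complex at the array's own
precision (the rule above does not say which complex type to pick). It only
covers integer, float and complex arrays.

```python
import numpy as np


def proposed_result_dtype(array_dtype, scalar):
    """Hypothetical helper: dtype of `array <op> scalar` under the middle-ground rule."""
    array_dtype = np.dtype(array_dtype)
    # Assumption: a Python scalar takes NumPy's default dtype for its type
    # (default integer, float64 or complex128).
    scalar_dtype = np.asarray(scalar).dtype

    if array_dtype.kind in "iu":
        # Integer arrays: the scalar upcasts exactly as a second array would.
        return np.promote_types(array_dtype, scalar_dtype)

    if array_dtype.kind == "f" and scalar_dtype.kind == "c":
        # Float array with a complex scalar: the one case where a float array
        # changes dtype. Assumption: keep the array's precision where a
        # matching complex type exists (float32 -> complex64).
        return np.promote_types(array_dtype, np.complex64)

    # Float and complex arrays otherwise keep their dtype.
    return array_dtype


print(proposed_result_dtype(np.float32, 4.0))  # float32
print(proposed_result_dtype(np.float32, 1j))   # complex64
print(proposed_result_dtype(np.int8, 1.5))     # float64
print(proposed_result_dtype(np.int8, 1000))    # platform default integer
```

Under this sketch an int8 array combined with 1000 is widened to the default
integer type instead of overflowing, while a float32 array multiplied by 4.0
stays float32.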

-=- Olivier
