[Numpy-discussion] Do we want scalar casting to behave as it does at the moment?

Andrew Collette andrew.collette@gmail....
Fri Jan 4 11:25:07 CST 2013


> In fact in 1.6 there is no assignment of a dtype to '1' which makes
> the way 1.6 handles it consistent with the array rules:

I guess I'm a little out of my depth here... what are the array rules?

>   # Ah-hah, it looks like '1' has a uint8 dtype:
>   (np.ones(2, dtype=np.uint8) / np.ones(2, dtype=np.uint8)).dtype == np.uint8
>   (np.ones(2, dtype=np.uint8) / 1).dtype == np.uint8
>   # But wait! No it doesn't!
>   (np.ones(2, dtype=np.int8) / np.ones(2, dtype=np.uint8)).dtype == np.int16
>   (np.ones(2, dtype=np.int8) / 1).dtype == np.int8
>   # Apparently in this case it has an int8 dtype instead.
>   (np.ones(2, dtype=np.int8) / np.ones(2, dtype=np.int8)).dtype == np.int8
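(The quoted snippets date from Python 2, where `/` on integer arrays meant floor division; under Python 3, `/` always true-divides to float, so re-running them today requires `//`. A sketch of the same four cases:)

```python
import numpy as np

# Python 3's / always true-divides to float, so use // to
# reproduce the integer-division examples quoted above.
u = np.ones(2, dtype=np.uint8)
i = np.ones(2, dtype=np.int8)

print((u // u).dtype)  # uint8: same-dtype operands keep their dtype
print((u // 1).dtype)  # uint8: a small in-range Python int adopts the array's dtype
print((i // u).dtype)  # int16: mixed signedness promotes to a wider signed type
print((i // 1).dtype)  # int8: but the same scalar 1 leaves int8 alone
```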

Yes, this is a good point... I hadn't thought about whether it should
be unsigned or signed.  In the case of something like "1", where it's
ambiguous, couldn't we prefer the sign of the other participant in the
operation?
> interaction between these. But in 1.6, as soon as you have a uint8
> array, suddenly all the other precisions might spring magically into
> being at any moment.

I can see how this would be really annoying for someone close to the
max memory on their machine.
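(A small illustration of the memory concern: once array-array promotion kicks in, a uint8/int8 mix silently doubles storage. This uses only the uncontroversial array promotion rule:)

```python
import numpy as np

a = np.ones(1000, dtype=np.uint8)  # 1000 bytes
b = np.ones(1000, dtype=np.int8)   # 1000 bytes

# uint8 + int8 promotes to int16, the smallest signed type
# that can hold both ranges -- twice the memory of either input.
c = a + b
print(c.dtype, c.nbytes)  # int16 2000
```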

> So options:
> If we require that new dtypes shouldn't be suddenly introduced then we
> have to pick from:
>   1) a / 300 silently rolls over the 300 before attempting the
> operation (1.5-style)

Were people really not happy with this behavior?  My reading of this thread:


was that the change was, although not an accident, certainly
unexpected for most people.  I don't have a strong preference either
way, but I'm interested in why we're so eager to keep the "corrected"
behavior.

>   2) a / 300 upcasts to machine precision (use the same rules for
> arrays and scalars)
>   3) a / 300 gives an error (the proposal you don't like)
> If we instead treat a Python scalar like 1 as having the smallest
> precision dtype that can hold its value, then we have to accept either
>   uint8 + 1 -> uint16
> or
>   int8 + 1 -> int16
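(To make option (1) concrete: the 1.5-style behaviour amounts to an unsafe cast of the scalar to the array's dtype before the operation. The cast itself is easy to reproduce explicitly, since `astype` is an unchecked cast; a sketch:)

```python
import numpy as np

# 300 does not fit in uint8, so an unsafe cast wraps it modulo 256.
scalar = np.array(300).astype(np.uint8)
print(int(scalar))  # 44, i.e. 300 % 256

# The 1.5-style result of a // 300 on a uint8 array would therefore
# really be a // 44, computed entirely in uint8.
a = np.full(3, 200, dtype=np.uint8)
print(a // scalar)  # [4 4 4], since 200 // 44 == 4
```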

Is there any consistent way we could prefer the "signedness" of the
other participant?  That would lead to both uint8 + 1 -> uint8 and
int8 + 1 -> int8.
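(For what it's worth, both dtype-preserving outcomes asked for here already hold whenever the scalar fits in the array's dtype -- this is just 1.6's value-based behaviour for small in-range scalars, shown as a quick check rather than an endorsement of either rule:)

```python
import numpy as np

# A small in-range Python int takes on the signedness (and width)
# of the array it meets, so neither operation upcasts.
print((np.ones(2, dtype=np.uint8) + 1).dtype)  # uint8
print((np.ones(2, dtype=np.int8) + 1).dtype)   # int8
```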

> Or there's the current code, whose behaviour no-one actually
> understands. (And I mean that both figuratively -- it's clearly
> confusing enough that people won't be able to remember it well in
> practice -- and literally -- even we developers don't know what it
> will do without running it to see.)

I agree the current behavior is confusing.  Regardless of the details
of what to do, I suppose my main objection is that, to me, it's really
unexpected that adding a number to an array could result in an error.
