[Numpy-discussion] Do we want scalar casting to behave as it does at the moment?

Andrew Collette andrew.collette@gmail....
Fri Jan 4 11:25:07 CST 2013


Hi,

> In fact in 1.6 there is no assignment of a dtype to '1' which makes
> the way 1.6 handles it consistent with the array rules:

I guess I'm a little out of my depth here... what are the array rules?

>   # Ah-hah, it looks like '1' has a uint8 dtype:
>   (np.ones(2, dtype=np.uint8) / np.ones(2, dtype=np.uint8)).dtype == np.uint8
>   (np.ones(2, dtype=np.uint8) / 1).dtype == np.uint8
>   # But wait! No it doesn't!
>   (np.ones(2, dtype=np.int8) / np.ones(2, dtype=np.uint8)).dtype == np.int16
>   (np.ones(2, dtype=np.int8) / 1).dtype == np.int8
>   # Apparently in this case it has an int8 dtype instead.
>   (np.ones(2, dtype=np.int8) / np.ones(2, dtype=np.int8)).dtype == np.int8

Yes, this is a good point... I hadn't thought about whether it should
be unsigned or signed.  In a case like "1", where the signedness is
ambiguous, couldn't we prefer the sign of the other participant in the
operation?
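For what it's worth, here's a small sketch of the asymmetry above.  I'm
using floor division (//) so it runs the same way under Python 3's true
division; the exact dtypes may differ across NumPy versions:

```python
import numpy as np

u = np.ones(2, dtype=np.uint8)
i = np.ones(2, dtype=np.int8)

# Mixing signed and unsigned arrays upcasts to a type that can hold both:
print((i // u).dtype)   # int16

# But a Python scalar "1" just follows the array it's combined with:
print((u // 1).dtype)   # uint8
print((i // 1).dtype)   # int8
```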

> interaction between these. But in 1.6, as soon as you have a uint8
> array, suddenly all the other precisions might spring magically into
> being at any moment.

I can see how this would be really annoying for someone working close
to their machine's memory limit.

> So options:
> If we require that new dtypes shouldn't be suddenly introduced then we
> have to pick from:
>   1) a / 300 silently rolls over the 300 before attempting the
> operation (1.5-style)

Were people really not happy with this behavior?  My reading of this thread:

http://thread.gmane.org/gmane.comp.python.numeric.general/47986

was that the change was, although not an accident, certainly
unexpected for most people.  I don't have a strong preference either
way, but I'm interested in why we're so eager to keep the "corrected"
behavior.

>   2) a / 300 upcasts to machine precision (use the same rules for
> arrays and scalars)
>   3) a / 300 gives an error (the proposal you don't like)
>
> If we instead treat a Python scalar like 1 as having the smallest
> precision dtype that can hold its value, then we have to accept either
>   uint8 + 1 -> uint16
> or
>   int8 + 1 -> int16

Is there any consistent way we could prefer the "signedness" of the
other participant?  That would lead to both uint8 + 1 -> uint8 and
int8 + 1 -> int8.
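A quick check of the promotion table (assuming np.promote_types reflects
the array-array rules) shows why giving "1" its own smallest signed
dtype forces a widening, while letting the scalar adopt the other
operand's signedness would not:

```python
import numpy as np

# If "1" gets its own signed dtype (int8), combining with uint8 must widen,
# since neither int8 nor uint8 can hold the other's full range:
print(np.promote_types(np.uint8, np.int8))   # int16

# If the scalar instead follows the array's signedness, nothing widens:
u = np.ones(2, dtype=np.uint8)
i = np.ones(2, dtype=np.int8)
print((u + 1).dtype)   # uint8
print((i + 1).dtype)   # int8
```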

> Or there's the current code, whose behaviour no-one actually
> understands. (And I mean that both figuratively -- it's clearly
> confusing enough that people won't be able to remember it well in
> practice -- and literally -- even we developers don't know what it
> will do without running it to see.)

I agree the current behavior is confusing.  Regardless of the details
of what to do, I suppose my main objection is that, to me, it's really
unexpected that adding a number to an array could result in an
exception.
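To make the alternatives concrete: depending on which rule NumPy adopts
(and on the version you run), an out-of-range scalar either silently
upcasts the result or raises.  A sketch of what the two outcomes look
like at the REPL:

```python
import numpy as np

a = np.ones(2, dtype=np.uint8)
try:
    # 300 does not fit in uint8; under value-based casting this upcasts,
    # under an error-raising rule it would throw instead.
    result = (a + 300).dtype
    print("upcast to", result)
except OverflowError:
    print("raised OverflowError")
```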

Andrew
