[Numpy-discussion] Do we want scalar casting to behave as it does at the moment?

Peter Cock p.j.a.cock@googlemail....
Thu Jan 3 18:39:50 CST 2013


On Fri, Jan 4, 2013 at 12:11 AM, Dag Sverre Seljebotn
<d.s.seljebotn@astro.uio.no> wrote:
> On 01/04/2013 12:39 AM, Andrew Collette wrote:
> > Nathaniel Smith wrote:
> >> Consensus in that bug report seems to be that for array/scalar operations like:
> >>    np.array([1], dtype=np.int8) + 1000 # can't be represented as an int8!
> >> we should raise an error, rather than either silently upcasting the
> >> result (as in 1.6 and 1.7) or silently downcasting the scalar (as in
> >> 1.5 and earlier).
> >
> > I have run into this a few times as a NumPy user, and I just wanted to
> > comment that (in my opinion), having this case generate an error is
> > the worst of both worlds.  The reason people can't decide between
> > rollover and promotion is because neither is objectively better.  One
>
> If neither is objectively better, I think that is a very good reason to
> kick it down to the user. "Explicit is better than implicit".
>
> > avoids memory inflation, and the other avoids losing precision.  You
> > just need to pick one and document it.  Kicking the can down the road
> > to the user, and making him/her explicitly test for this condition, is
> > not a very good solution.
>
> It's a good solution to encourage bug-free code. It may not be a good
> solution to avoid typing.
>
> > What does this mean in practical terms for NumPy users?  I personally
> > don't relish the choice of always using numpy.add, or always wrapping
> > my additions in checks for ValueError.
>
> I think you usually have a bug in your program when this happens, since
> either the dtype is wrong, or the value one is trying to store is wrong.
> I know that's true for myself, though I don't claim to know everybody
> else's use cases.

I agree with Dag rather than Andrew: "Explicit is better than
implicit", i.e. what Nathaniel described earlier as the apparent
consensus.
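
For anyone skimming the thread, here is the case in question as a
short script (behaviour labelled per version, as described above):

    import numpy as np

    a = np.array([1], dtype=np.int8)

    # 1000 cannot be represented as an int8 (range -128..127), so the
    # outcome depends on the casting rules in force:
    #  * NumPy 1.5 and earlier: the scalar is silently downcast to
    #    int8, wrapping around, and the result stays int8 (corruption)
    #  * NumPy 1.6/1.7: the result is silently upcast to a wider type
    #    (int16 here), overriding the explicit choice of int8
    #  * the proposal: raise an error rather than do either silently
    result = a + 1000
    print(result, result.dtype)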

Since I've actually used NumPy arrays with specific low-memory
types, I thought I should comment on my use case in case it is
helpful:

I've only used the low-precision types like np.uint8 (unsigned) where
I needed to limit my memory usage. In this case, the topology of a
multigraph (a graph allowing multiple edges) was held as an integer
adjacency matrix, A. I would calculate things like A^n to count paths
of length n, and also make changes to A directly (e.g. adding edges).
So an overflow was always possible, and neither the old behaviour
(type-preserving, but wrapping on overflow and corrupting the data)
nor the current behaviour (type promotion, overriding my deliberate
memory management) is nice. My preference here would be for an
exception, so that I knew right away.
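
Something along these lines, with made-up numbers to keep it small
(and noting that np.dot on two uint8 arrays stays uint8):

    import numpy as np

    # Adjacency matrix of a small multigraph, held as uint8 on purpose
    # to keep memory usage down (the real matrices are much larger).
    A = np.zeros((3, 3), dtype=np.uint8)
    A[0, 1] = 200   # 200 parallel edges from node 0 to node 1
    A[1, 2] = 2     # 2 parallel edges from node 1 to node 2

    # Entries of A^2 count paths of length 2. The true count from node
    # 0 to node 2 is 200 * 2 = 400, which overflows uint8 and silently
    # wraps to 144 under the old type-preserving rules:
    paths = np.dot(A, A)
    print(paths[0, 2], paths.dtype)

    # Adding edges is where scalar casting matters: a scalar too big
    # for uint8 silently promotes the whole matrix under 1.6/1.7,
    # defeating the point of choosing uint8 in the first place.
    A = A + 300
    print(A.dtype)   # uint16 on 1.6/1.7; wrapped back to uint8 on 1.5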

The other use case which comes to mind is dealing with low-level
libraries and/or binary file formats, where automagic type promotion
would probably be unwelcome.
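
For example, with a hypothetical binary format that expects exactly
one byte per value (the filename here is just for illustration):

    import numpy as np

    pixels = np.array([10, 20, 30], dtype=np.uint8)

    # 300 does not fit in uint8, so on 1.6/1.7 the result is silently
    # promoted to uint16...
    shifted = pixels + 300

    # ...and this now writes two bytes per value instead of one,
    # quietly producing a file the downstream tool cannot read:
    shifted.tofile("frame.raw")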

Regards,

Peter

