[Numpy-discussion] Possible bug in scalar * array
Todd Miller
jmiller at stsci.edu
Tue Oct 21 16:07:16 CDT 2003
On Tue, 2003-10-21 at 15:04, Nadav Horesh wrote:
> In my opinion state 2 is superfluous: consider a large loop where integer arrays are multiplied by a wide range of scalars and, at some point, an exception is raised. It is not easy to track down what happened, especially when the scalars are not ordered (say, read from a data file). I cannot find any justification for the state 2 singularity (A*n is OK, A*(n+1) is not).
I see your point about the "singularity", but because of the new scalar
rules in numarray, checking for overflow seems necessary: there is a
hidden downcast from the scalar to the array type for scalar_vector and
vector_scalar operations.
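
As a rough illustration of the kind of check being discussed (not numarray's actual implementation), the sketch below tests whether a Python integer scalar fits in the range of an integer array type before it would be downcast. The table INT_RANGES and the function check_scalar_fits are hypothetical names introduced only for this example.

```python
# Hypothetical sketch: range-check a scalar before it is downcast to an
# integer array type. The names and the table are illustrative and are
# not part of the numarray API.
INT_RANGES = {
    "Int8": (-2**7, 2**7 - 1),
    "UInt8": (0, 2**8 - 1),
    "Int16": (-2**15, 2**15 - 1),
    "UInt16": (0, 2**16 - 1),
    "Int32": (-2**31, 2**31 - 1),
}


def check_scalar_fits(scalar, type_name):
    """Return scalar unchanged if type_name can represent it, else raise."""
    lo, hi = INT_RANGES[type_name]
    if not lo <= scalar <= hi:
        raise OverflowError(
            "scalar %d out of range for %s [%d, %d]"
            % (scalar, type_name, lo, hi)
        )
    return scalar
```

With this sketch, check_scalar_fits(2, "UInt8") passes, while check_scalar_fits(-1, "UInt8") and check_scalar_fits(100000, "Int16") raise OverflowError. Those are the two cases from the original report.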
> I suspect that state 3 is the fastest (it is up to you to judge); it is also consistent with the behavior of the __add__ operator.
My point was that without a reason to value consistency between the
overflow results of __add__ and __mul__, and with no guarantee that the
consistency is really obtainable anyway, why worry about it?
> Why should the __add__ operator carry the risk of being nonportable while __mul__ should not?
The reason __add__ and __mul__ are treated differently is that there is
a different "probability" of overflow for each. With __mul__, overflow
is much more likely, so we deal with it and try to flag the elements
where it occurred. With __add__, we don't want to pay the significant
performance penalty of checking for overflow.
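
To make the three candidate behaviours from the quoted proposal concrete, here is a plain-Python sketch for unsigned 8-bit values. It only mimics the arithmetic and says nothing about how numarray implements it; the function names are hypothetical.

```python
# Illustrative only: three ways to handle overflow for UInt8 values,
# written in plain Python. Function names are hypothetical.
UINT8_MAX = 255


def mul_saturate(values, scalar):
    """State 1: clip each product to the UInt8 range."""
    return [min(max(v * scalar, 0), UINT8_MAX) for v in values]


def mul_checked(values, scalar):
    """State 2: raise if the scalar itself does not fit in UInt8."""
    if not 0 <= scalar <= UINT8_MAX:
        raise OverflowError("scalar %d does not fit in UInt8" % scalar)
    return mul_saturate(values, scalar)


def mul_wrap(values, scalar):
    """State 3: wrap around modulo 256, as C unsigned arithmetic does."""
    return [(v * scalar) % 256 for v in values]


a = [100, 200, 128]
print(mul_saturate(a, 2))  # [200, 255, 255]
print(mul_wrap(a, 2))      # [200, 144, 0]
```

The saturated result matches what a*2 gives in the quoted example, and the wrapped result matches a+a.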
> Which state should be the default?
> There are two ways to use the system: interactively and in a script. In my opinion the default should be the state that best fits interactive mode: slower and with a lot of checking. Maybe it is better to add a section to the documentation on how to tune the package for maximum performance: those who are interested in high-performance computing should be ready to do some extra work (read the manual, at least).
Your arguments all make sense to me, Nadav, but it ultimately boils down
to what level of overflow checking we have the will to implement. Right
now, we have the will to fix the bug in the vector scalar error handling
and a simple choice of what to do with one particular new error we've
uncovered: ignore it or trap it.
Regards,
Todd
>
> Nadav.
>
> -----Original Message-----
> From: Todd Miller [mailto:jmiller at stsci.edu]
> Sent: Tue 21-Oct-03 17:12
> To: Nadav Horesh
> Cc: Edward C. Jones; numpy-discussion
> Subject: RE: [Numpy-discussion] Possible bug in scalar * array
> On Tue, 2003-10-21 at 10:07, Nadav Horesh wrote:
> > As I understand the range checking (from the results, not from the source code), it checks whether the range of the scalar exceeds the range of the array element type. I don't see any significant execution-time penalty in that. However, there might be a place for flag-controlled behavior:
> > * State 1: stay with the current "saturated" over/underflow whatever the scalar is. This is consistent with what numarray does now with scalars in range.
> > * State 2: Raise exception as suggested.
> > * State 3: Use the normal wrap-around on integer overflow, thus making a*2 give the same result as a+a in the following example:
> >
> > >>> a = array((100,200,128), type=UInt8)
> > >>> a+a
> > array([200, 144, 0], type=UInt8)
> > >>> a*2
> > array([200, 255, 255], type=UInt8)
> >
> > Nadav
>
> This sounds like an attempt to turn a bug fix into a coherent plan. :)
>
> We could implement what you're describing here a lot like we handle IEEE
> floating point, but I'm wondering if we should. I think state 3 is
> marginally portable, so I'm not sure we should support it, but if we
> did, what would we use it for?
>
> Similarly, if we support both states 1 and 2, is anyone going to be
> sufficiently on the ball to know the difference and set their error
> handling appropriately? Or, are 99.9% of the people going to just use
> whatever the default is? If the latter is the case, we should just
> implement "the default" and keep life simple.
>
> I'm not opposed to this if there are valid uses for it, but we should
> know those reasons before implementing, and I don't.
>
> Thanks for the ideas,
> Todd
>
> >
> > -----Original Message-----
> > From: Todd Miller [mailto:jmiller at stsci.edu]
> > Sent: Mon 20-Oct-03 21:36
> > To: Edward C. Jones
> > Cc: numpy-discussion
> > Subject: Re: [Numpy-discussion] Possible bug in scalar * array
> > I tracked down the problem to some (relatively) new overflow checking
> > code which detects the overflow of the scalar -1 as it is assigned to an
> > array pseudo buffer of type UInt8. This error was mishandled, and hence
> > was transformed into an invalid shape tuple (you gotta smile :-)). The
> > *2nd* call is where the exception shows up because of caching logic.
> >
> > I talked this over with Perry and we concluded that it's probably a good
> > thing to trap the out of range scalar values before using them. Thus,
> > we're proposing to fix the error handling, but to make the calls in
> > question raise an overflow exception on the first call. We are
> > interested in hearing other opinions however. Comments?
> >
> > Regards,
> > Todd
> >
> > On Sat, 2003-10-18 at 18:18, Edward C. Jones wrote:
> > > #! /usr/bin/env python
> > >
> > > # Python 2.3.2, numarray 0.7
> > > import numarray
> > >
> > > > def fun2(code, scale):
> > > >     arr = numarray.ones((4,4), code)
> > > >     arr2 = scale * arr
> > > >     # Bug appears at second multiply.
> > > >     arr3 = scale * arr
> > >
> > > # These calls fail when "scale" is too big for "code":
> > >
> > > > # File "/usr/local/lib/python2.3/site-packages/numarray/numarraycore.py", line 653, in __rmul__
> > > # def __rmul__(self, operand): return ufunc.multiply(operand, self)
> > > # ValueError: invalid shape tuple
> > >
> > > #fun2('Int16', 100000)
> > > fun2('UInt8' , -1)
> > >
> > >
> > >
> > > -------------------------------------------------------
> > > This SF.net email sponsored by: Enterprise Linux Forum Conference & Expo
> > > The Event For Linux Datacenter Solutions & Strategies in The Enterprise
> > > Linux in the Boardroom; in the Front Office; & in the Server Room
> > > http://www.enterpriselinuxforum.com
> > > _______________________________________________
> > > Numpy-discussion mailing list
> > > Numpy-discussion at lists.sourceforge.net
> > > https://lists.sourceforge.net/lists/listinfo/numpy-discussion
> > --
> > Todd Miller
> > Space Telescope Science Institute
> > 3700 San Martin Drive
> > Baltimore MD, 21030
> > (410) 338 - 4576
> >
> >
> >
> >
> >
> >
> >
>
>
>
>
>
>
--
Todd Miller
Space Telescope Science Institute
3700 San Martin Drive
Baltimore MD, 21030
(410) 338 - 4576