[Numpy-discussion] Possible bug in scalar * array

Todd Miller jmiller at stsci.edu
Wed Oct 22 08:27:04 CDT 2003


On Wed, 2003-10-22 at 10:11, Nadav Horesh wrote:
> O.K. got your case (and you certainly have one).
> 
> But ...
>   in the long run, wouldn't it be nice to give an option to remove all
> checking (and thus produce a machine dependent under/overflow treatment)
> for those who opt for speed?

Yes,  it would be nice,  but... not nice enough to "actually do it."  
If it is important enough,  eventually there will be a clamor for action
which I don't see right now.

> 
>  Really enjoyed this discussion,
> 
>     Nadav.

Me too.  Thanks for the input,

Todd

> 
> On Tue, 2003-10-21 at 22:47, Todd Miller wrote:
> > On Tue, 2003-10-21 at 15:04, Nadav Horesh wrote:
> > > In my opinion state 2 is surplus: consider a large loop where integer arrays are multiplied by a wide range of scalars and, at some point, an exception is raised. It is not easy to track down what happened, especially when the scalars are not ordered (say, read from a data file). I cannot find any justification for the state 2 singularity (A*n is OK, A*(n+1) is not).
> > 
> > I see your point about the "singularity", but because of the new scalar
> > rules in numarray, checking for overflow seems necessary:  there is a
> > hidden downcast from the scalar to the array type for scalar_vector and
> > vector_scalar operations.  
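
The hidden downcast described above can be sketched in plain Python. This is only an illustrative model, not numarray's implementation: the `INT_RANGES` table and the `check_scalar_fits` helper are hypothetical names introduced here, and the ranges are the standard ones for fixed-width integers.

```python
# Illustrative model only; not numarray's actual code.
# Standard value ranges for a few fixed-width integer types.
INT_RANGES = {
    "Int8": (-128, 127),
    "UInt8": (0, 255),
    "Int16": (-32768, 32767),
    "UInt16": (0, 65535),
}


def check_scalar_fits(scalar, type_name):
    """Return `scalar` unchanged, or raise OverflowError if it cannot be
    represented in the integer type named `type_name`."""
    low, high = INT_RANGES[type_name]
    if not low <= scalar <= high:
        raise OverflowError(
            "scalar %r is out of range for %s [%d, %d]"
            % (scalar, type_name, low, high)
        )
    return scalar


# Both scalars from the original bug report fall outside their array type.
for type_name, scale in (("UInt8", -1), ("Int16", 100000)):
    try:
        check_scalar_fits(scale, type_name)
    except OverflowError as exc:
        print(exc)
```

The check costs one comparison per scalar rather than one per array element, which is presumably why it is cheap compared with checking every result.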
> > 
> > > I suspect that state 3 is the fastest (it is up to you to judge); it is also consistent with the behavior of the __add__ operator.
> > 
> > My point was that without a reason to value consistency between the
> > overflow results of __add__ and __mul__, and with no guarantee that the
> > consistency is really obtainable anyway,  why worry about it?
> > 
> > > Why should the __add__ operator carry the risk of being nonportable while __mul__ does not?
> > 
> > The reason __add__ and __mul__ are treated differently is that there is
> > a different "probability" of overflow for each.  With __mul__, overflow
> > is much more likely, so we deal with it and try to flag the elements
> > where it occurred.  With __add__, we don't want to pay the significant
> > performance penalty of checking for overflow.
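
As a rough illustration of the different overflow "probability", the brute-force count below compares how many pairs of UInt8 values overflow under addition and under multiplication. It assumes every pair of values is equally likely, which real data rarely is, so treat the numbers as a sketch of the trend rather than a measurement.

```python
# Assumption: operands are drawn uniformly from the UInt8 range 0..255.
LIMIT = 256  # number of representable UInt8 values

add_overflows = 0
mul_overflows = 0
for a in range(LIMIT):
    for b in range(LIMIT):
        if a + b >= LIMIT:  # sum does not fit in UInt8
            add_overflows += 1
        if a * b >= LIMIT:  # product does not fit in UInt8
            mul_overflows += 1

total = LIMIT * LIMIT
print("add overflows: %d of %d pairs" % (add_overflows, total))
print("mul overflows: %d of %d pairs" % (mul_overflows, total))
```

Under this uniform assumption multiplication overflows for far more pairs than addition does; with small typical values the gap would look different, but products still grow much faster than sums.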
> > 
> > > Which state should be the default?
> > > There are two ways to use the system: interactively and in a script. In my opinion the default should be the state that better fits interactive mode: slower and with a lot of checking. Maybe it is better to add a section to the documentation on how to tune the package for maximum performance: those who are interested in high performance computing should be ready to do some extra work (read the manual, at least).
> > 
> > Your arguments all make sense to me Nadav, but it ultimately boils down
> > to what level of overflow checking we have the will to implement.  Right
> > now, we have the will to fix the bug in the vector scalar error handling
> > and a simple choice of what to do with one particular new error we've
> > uncovered: ignore it or trap it.
> > 
> > Regards,
> > Todd
> > 
> > > 
> > >   Nadav.
> > > 
> > > -----Original Message-----
> > > From:	Todd Miller [mailto:jmiller at stsci.edu]
> > > Sent:	Tue 21-Oct-03 17:12
> > > To:	Nadav Horesh
> > > Cc:	Edward C. Jones; numpy-discussion
> > > Subject:	RE: [Numpy-discussion] Possible bug in scalar * array
> > > On Tue, 2003-10-21 at 10:07, Nadav Horesh wrote:
> > > > As I understand the range checking (from the results, not from the source code), it checks whether the scalar exceeds the range of the array element type. I don't see any significant execution time penalty with that. However, there might be a place for flag-controlled behavior:
> > > > * State 1: stay with the current "saturated" over/underflow whatever the scalar is. This is consistent with what numarray does now with scalars in range.
> > > > * State 2: Raise exception as suggested.
> > > > * State 3: Use the normal wrap-around on integer overflow, thus making a*2 give the same results as a+a in the following example:
> > > > 
> > > > >>> a = array((100,200,128), type=UInt8)
> > > > >>> a+a
> > > > array([200, 144,   0], type=UInt8)
> > > > >>> a*2
> > > > array([200, 255, 255], type=UInt8)
> > > > 
> > > >   Nadav
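
The three proposed states can be modeled in plain Python for UInt8 data. This is a hedged sketch, not numarray code: `multiply_uint8` and its `mode` argument are hypothetical, and state 2 is interpreted here as "raise when the scalar itself is out of range, otherwise saturate as in state 1".

```python
# Hypothetical model of the three states for UInt8; not numarray's API.
UINT8_MIN, UINT8_MAX = 0, 255


def multiply_uint8(values, scalar, mode="saturate"):
    """Multiply each UInt8 value by `scalar` under one of three policies.

    "saturate" (state 1): clamp every product into 0..255.
    "raise"    (state 2): reject an out-of-range scalar, else saturate.
    "wrap"     (state 3): keep the product modulo 256.
    """
    if mode not in ("saturate", "raise", "wrap"):
        raise ValueError("unknown mode: %r" % (mode,))
    if mode == "raise" and not UINT8_MIN <= scalar <= UINT8_MAX:
        raise OverflowError("scalar %r does not fit in UInt8" % (scalar,))
    products = [v * scalar for v in values]
    if mode == "wrap":
        return [p % (UINT8_MAX + 1) for p in products]
    return [min(max(p, UINT8_MIN), UINT8_MAX) for p in products]


a = [100, 200, 128]
print(multiply_uint8(a, 2, "saturate"))  # [200, 255, 255], like a*2 above
print(multiply_uint8(a, 2, "wrap"))      # [200, 144, 0], like a+a above
```

In this model the "singularity" of state 2 is visible directly: a scalar of 255 is accepted and the result saturates, while 256 raises `OverflowError`.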
> > > 
> > > This sounds like an attempt to turn a bug fix into a coherent plan. :)
> > > 
> > > We could implement what you're describing here a lot like we handle IEEE
> > > floating point,  but I'm wondering if we should.  I think state 3 is
> > > marginally portable, so I'm not sure we should support it,  but if we
> > > did, what would we use it for? 
> > > 
> > > Similarly,  if we support both states 1 and 2,  is anyone going to be
> > > sufficiently on the ball to know the difference and set their error
> > > handling appropriately?  Or, are 99.9% of the people going to just use
> > > whatever the default is?  If the latter is the case, we should just
> > > implement "the default" and keep life simple.
> > > 
> > > I'm not opposed to this if there are valid uses for it,  but we should
> > > know those reasons before implementing, and I don't.
> > > 
> > > Thanks for the ideas,
> > > Todd
> > > 
> > > > 
> > > > -----Original Message-----
> > > > From:	Todd Miller [mailto:jmiller at stsci.edu]
> > > > Sent:	Mon 20-Oct-03 21:36
> > > > To:	Edward C. Jones
> > > > Cc:	numpy-discussion
> > > > Subject:	Re: [Numpy-discussion] Possible bug in scalar * array
> > > > I tracked down the problem to some (relatively) new overflow checking
> > > > code which detects the overflow of the scalar -1 as it is assigned to an
> > > > array pseudo buffer of type UInt8.  This error was mishandled, and hence
> > > > was transformed into an invalid shape tuple (you gotta smile :-)).  The
> > > > *2nd* call is where the exception shows up because of caching logic.
> > > > 
> > > > I talked this over with Perry and we concluded that it's probably a good
> > > > thing to trap the out of range scalar values before using them.  Thus,
> > > > we're proposing to fix the error handling,  but to make the calls in
> > > > question raise an overflow exception on the first call.  We are
> > > > interested in hearing other opinions however.  Comments?
> > > > 
> > > > Regards,
> > > > Todd
> > > > 
> > > > On Sat, 2003-10-18 at 18:18, Edward C. Jones wrote:
> > > > > #! /usr/bin/env python
> > > > > 
> > > > > # Python 2.3.2, numarray 0.7
> > > > > import numarray
> > > > > 
> > > > > def fun2(code, scale):
> > > > >      arr = numarray.ones((4,4), code)
> > > > >      arr2 = scale * arr
> > > > >      # Bug appears at second multiply.
> > > > >      arr3 = scale * arr
> > > > > 
> > > > > # These calls fail when "scale" is too big for "code":
> > > > > 
> > > > > #   File "/usr/local/lib/python2.3/site-packages/numarray/numarraycore.py", line 653, in __rmul__
> > > > > #    def __rmul__(self, operand): return ufunc.multiply(operand, self)
> > > > > # ValueError: invalid shape tuple
> > > > > 
> > > > > #fun2('Int16', 100000)
> > > > > fun2('UInt8' , -1)
> > > > > 
> > > > > 
> > > > > 
> > > > -- 
> > > > Todd Miller 			
> > > > Space Telescope Science Institute
> > > > 3700 San Martin Drive
> > > > Baltimore MD, 21030
> > > > (410) 338 - 4576
> 
-- 
Todd Miller 			
Space Telescope Science Institute
3700 San Martin Drive
Baltimore MD, 21030
(410) 338 - 4576




