[Numpy-discussion] abs for max negative integers - desired behavior?
Frédéric Bastien
nouiz@nouiz....
Tue Oct 18 13:34:38 CDT 2011
What about a parameter that lets the user select the behavior they want?
It would choose between uint, upcasted_int, -MAX and +MAX. That way,
the behavior is at least documented, and users who care have a choice.
Personally, when the option is available, I would prefer the safe
version, uint, but I understand that not everyone shares that position.
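
To make the four options concrete, here is a rough sketch of what such a
switch could look like. The function name abs_with_policy and the policy
keyword are hypothetical and not part of NumPy; the policy strings are
just the labels used above, and "-MAX" is assumed to mean the current
wrap-around result (the most negative value is returned unchanged).

```python
import numpy as np

# Hypothetical helper for illustration only; this is not a NumPy API.
_WIDER = {1: np.int16, 2: np.int32, 4: np.int64}


def abs_with_policy(x, policy="uint"):
    """Absolute value of a signed integer array with an explicit
    choice of what happens to the most negative value."""
    x = np.asarray(x)
    if not np.issubdtype(x.dtype, np.signedinteger):
        raise TypeError("expected a signed integer array")

    if policy == "-MAX":
        # Wrap-around result: abs(-128) stays -128 for int8.
        return np.abs(x)

    if policy == "uint":
        # Reinterpret the wrapped result as the unsigned type of the
        # same width: the bit pattern of int8 -128 is uint8 128.
        unsigned = np.dtype("u%d" % x.dtype.itemsize)
        return np.asarray(np.abs(x)).view(unsigned)

    if policy == "upcasted_int":
        # Compute in the next wider signed type (costs extra memory).
        if x.dtype.itemsize not in _WIDER:
            raise TypeError("no wider signed integer type available")
        return np.abs(x.astype(_WIDER[x.dtype.itemsize]))

    if policy == "+MAX":
        # Saturate: replace the wrapped value by the largest positive one.
        a = np.abs(x)
        top = x.dtype.type(np.iinfo(x.dtype).max)
        return np.where(a < 0, top, a)

    raise ValueError("unknown policy: %r" % (policy,))


x = np.array([-128, -1, 127], dtype=np.int8)
for name in ("uint", "upcasted_int", "-MAX", "+MAX"):
    result = abs_with_policy(x, name)
    print(name, result.dtype, result.tolist())
```

Only the "+MAX" branch needs an extra pass over the data; "uint" is a
reinterpretation of the bytes that abs already produces.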
Frédéric Bastien
On Sat, Oct 15, 2011 at 3:00 PM, Matthew Brett <matthew.brett@gmail.com> wrote:
> Hi,
>
> On Wed, Oct 12, 2011 at 8:31 AM, David Cournapeau <cournape@gmail.com> wrote:
>> On 10/12/11, "V. Armando Solé" <sole@esrf.fr> wrote:
>>> On 12/10/2011 10:46, David Cournapeau wrote:
>>>> On Wed, Oct 12, 2011 at 9:18 AM, "V. Armando Solé" wrote:
>>>>> From a pure user perspective, I would not expect the abs function to
>>>>> return a negative number. Returning +127 plus a warning the first time
>>>>> that happens seems to me a good compromise.
>>>> I guess the question is what's the common context to use small
>>>> integers in the first place. If it is to save memory, then upcasting
>>>> may not be the best solution. I may be wrong, but if you decide to use
>>>> those types in the first place, you need to know about overflows. Abs
>>>> is just one of them (dividing by -1 is another, although this one
>>>> actually raises an exception).
>>>>
>>>> Detecting it may be costly, but this would need benchmarking.
>>>>
>>>> That being said, without context, I don't find 127 a better solution than
>>>> -128.
>>>
>>> Well that choice is just based on getting the closest positive number to
>>> the true value (128). The context can be anything, for instance you
>>> could be using a look up table based on the result of an integer
>>> operation ...
>>>
>>> In terms of cost, it would mean evaluating the cost of something like:
>>>
>>> a = abs(x);
>>> if (a < 0) {a = -(a + 1);}
>>> return a;
>>
>> Yes, this is costly: it adds a branch to a trivial operation. I did
>> some preliminary benchmarks (would need confirmation when I have more
>> than one minute to spend on this):
>>
>> int8, 2**16 long array. Before check: 16 us. After check: 92 us. 5-6
>> times slower
>> int8, 2**24 long array. Before check: 20ms. After check: 30ms. 30 % slower.
>>
>> There is also the issue of signaling the error in the ufunc machinery.
>> I forgot whether this is possible at that level.
>
> I suppose that returning the equivalent uint type would be of zero cost though?
>
> I don't think the problem should be relegated to 'people should know
> about this' because this is a problem for any signed integer type, and it
> can lead to nasty errors which people are unlikely to test for.
>
> See you,
>
> Matthew
