[Numpy-discussion] What should be the value of nansum of nan's?

josef.pktd@gmai... josef.pktd@gmai...
Thu Apr 29 18:52:58 CDT 2010


On Thu, Apr 29, 2010 at 12:56 PM, Charles R Harris
<charlesr.harris@gmail.com> wrote:
>
>
> On Wed, Apr 28, 2010 at 11:56 AM, T J <tjhnson@gmail.com> wrote:
>>
>> On Mon, Apr 26, 2010 at 10:03 AM, Charles R Harris
>> <charlesr.harris@gmail.com> wrote:
>> >
>> >
>> > On Mon, Apr 26, 2010 at 10:55 AM, Charles R Harris
>> > <charlesr.harris@gmail.com> wrote:
>> >>
>> >> Hi All,
>> >>
>> >> We need to make a decision for ticket #1123 regarding what nansum should
>> >> return when all values are nan. At some earlier point it was zero, but
>> >> currently it is nan, in fact it is nan whatever the operation is. That is
>> >> consistent, simple and serves to mark the array or axis as containing all
>> >> nans. I would like to close the ticket and am a bit inclined to go with the
>> >> current behaviour although there is an argument to be made for returning 0
>> >> for the nansum case. Thoughts?
>> >>
>> >
>> > To add a bit of context, one could argue that the results should be
>> > consistent with the equivalent operations on empty arrays and always be
>> > non-nan.
>> >
>> > In [1]: nansum([])
>> > Out[1]: nan
>> >
>> > In [2]: sum([])
>> > Out[2]: 0.0
>> >
>>
>> This seems like an obvious one to me.  What is the spirit of nansum?
>>
>> """
>>    Return the sum of array elements over a given axis treating
>>    Not a Numbers (NaNs) as zero.
>> """
>>
>> Okay.  So NaNs in an array are treated as zeros and the sum is
>> performed as one normally would perform it starting with an initial
>> sum of zero.  So if all values are NaN, then we add nothing to our
>> original sum and still return 0.
>>
>> I'm not sure I understand the argument that it should return NaN.  It
>> is counter to the *purpose* of nansum.   Also, if one wants to
>> determine if all values in an array are NaN, isn't there another way?
>> Let's keep (or make) those distinct operations, as they are definitely
>> distinct concepts.
>
> It looks like the consensus is that zero should be returned. This is a
> change from current behaviour and that bothers me a bit. Here are some other
> oddities
>
> In [6]: nanmax([nan])
> Out[6]: nan
>
> In [7]: nanargmax([nan])
> Out[7]: nan
>
> In [8]: nanargmax([1])
> Out[8]: 0
>
> So it looks like the current behaviour is very much tilted towards nans as
> missing data flags. I think we should just leave that as is with perhaps a
> note in the docs to that effect. The decision here should probably
> accommodate the current users of these functions, of which I am not one. If
> we leave the current behaviour as is then I think the rest of the nan
> functions need fixes to return nan for empty sequences as nansum is the only
> one that currently does that.

I disagree; I really would like nansum([nan]) to return zero.
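
A minimal sketch of the behaviour I have in mind, assuming the docstring's "treat NaNs as zero" reading. The name `nansum_zero` is hypothetical (it is not a NumPy function) and this is not how nansum is actually implemented:

```python
import numpy as np

def nansum_zero(a, axis=None):
    """Hypothetical helper: sum with every NaN replaced by 0 before summing."""
    a = np.asarray(a, dtype=float)
    # Replace NaNs with the additive identity, then do an ordinary sum.
    return np.where(np.isnan(a), 0.0, a).sum(axis=axis)

all_nan = nansum_zero([np.nan, np.nan])   # nothing is added to the initial 0
mixed = nansum_zero([1.0, np.nan, 2.0])   # NaN contributes nothing
empty = nansum_zero([])                   # same answer as sum([])
per_row = nansum_zero([[np.nan, 1.0], [np.nan, np.nan]], axis=1)
```

With this reading an all-nan input and an empty input give the same result, which is the consistency with sum([]) that was asked for earlier in the thread.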

max and min don't have a neutral element; only sum and prod do (along
with cumsum, cumprod and maybe others). So for nanmax and nanmin, nan is
the obvious answer (the alternative being an exception, which is what
the plain versions raise on empty input):

>>> np.max([])
Traceback (most recent call last):
  File "<pyshell#124>", line 1, in <module>
    np.max([])
  File "C:\Programs\Python25\lib\site-packages\numpy\core\fromnumeric.py", line 1765, in amax
    return _wrapit(a, 'max', axis, out)
  File "C:\Programs\Python25\lib\site-packages\numpy\core\fromnumeric.py", line 37, in _wrapit
    result = getattr(asarray(obj),method)(*args, **kwds)
ValueError: zero-size array to ufunc.reduce without identity
>>> np.argmax([])
Traceback (most recent call last):
  File "<pyshell#125>", line 1, in <module>
    np.argmax([])
  File "C:\Programs\Python25\lib\site-packages\numpy\core\fromnumeric.py", line 717, in argmax
    return _wrapit(a, 'argmax', axis)
  File "C:\Programs\Python25\lib\site-packages\numpy\core\fromnumeric.py", line 37, in _wrapit
    result = getattr(asarray(obj),method)(*args, **kwds)
ValueError: attempt to get argmax/argmin of an empty sequence
>>>


>>> min([])
Traceback (most recent call last):
  File "<pyshell#126>", line 1, in <module>
    min([])
ValueError: min() arg is an empty sequence
>>> max([])
Traceback (most recent call last):
  File "<pyshell#127>", line 1, in <module>
    max([])
ValueError: max() arg is an empty sequence
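
The neutral-element point can be checked directly on the ufuncs. This is only a sketch using the `identity` attribute and `reduce` method of ufuncs; the variable names are mine:

```python
import numpy as np

# A ufunc's `identity` is its neutral element, or None if it has none.
identities = {
    "add": np.add.identity,
    "multiply": np.multiply.identity,
    "maximum": np.maximum.identity,
    "minimum": np.minimum.identity,
}

empty = np.array([], dtype=float)

# Reductions with an identity have a well-defined result on empty input.
empty_sum = np.add.reduce(empty)
empty_prod = np.multiply.reduce(empty)

# Reductions without an identity have nothing sensible to return.
try:
    np.maximum.reduce(empty)
    max_raised = False
except ValueError:
    max_raised = True
```

So returning the identity (0 for sum, 1 for prod) is available for nansum and a nanprod, while for nanmax and nanmin the only choices are nan or an exception.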

Josef


>
> Chuck
>
>
>

