[Numpy-discussion] Why NaN?

Bruce Southey bsouthey@gmail....
Wed Aug 5 10:20:43 CDT 2009


On 08/05/2009 09:18 AM, Keith Goodman wrote:
> On Wed, Aug 5, 2009 at 1:40 AM, Bruce Southey<bsouthey@gmail.com>  wrote:
>    
>> On Tue, Aug 4, 2009 at 4:05 PM, Keith Goodman<kwgoodman@gmail.com>  wrote:
>>      
>>> On Tue, Aug 4, 2009 at 1:53 PM, Bruce Southey<bsouthey@gmail.com>  wrote:
>>>        
>>>> On Tue, Aug 4, 2009 at 1:40 PM, Gökhan Sever<gokhansever@gmail.com>  wrote:
>>>>          
>>>>> This is the loveliest of all solutions:
>>>>>
>>>>> c[isfinite(c)].mean()
>>>>>            
>>>> This handling of nonfinite elements has come up before.
>>>> Please remember that this only works for 1-D or flattened arrays, so
>>>> it does not work in general, especially along an axis.
>>>>          
>>> If you don't want to use nanmean from scipy.stats you could use:
>>>
>>> np.nansum(c, axis=0) / (~np.isnan(c)).sum(axis=0)
>>>
>>> or
>>>
>>> np.nansum(c, axis=0) / (c == c).sum(axis=0)
>>>
>>> But if c contains ints then you'll run into trouble with the division,
>>> so you'll need to protect against that.
>>>        
>> That is not a problem because nan and infinity are only defined for
>> floating point numbers, not integers. So any array that has nonfinite
>> elements like nans and infinity must have a floating point dtype.
>>      
>
> That is true. But I was thinking of this case (no nans or infs):
>
>    >>> c
>    array([[1, 2, 3],
>           [4, 5, 6]])
>    >>> c.mean(0)
>    array([ 2.5,  3.5,  4.5])   <--- good
>    >>> np.nansum(c, axis=0) / (c == c).sum(axis=0)
>    array([2, 3, 4])   <--- bad
>    >>> np.nansum(c, axis=0) / (c == c).sum(axis=0, dtype=np.float)
>    array([ 2.5,  3.5,  4.5])   <--- good
>    
Sure, but that is about ints versus floats and not about nans or infs.
Your 'good' examples first convert the int array into a float array,
while your 'bad' example keeps the int dtype (you get the same result if
you cast the output of the 'good' approaches back to an int dtype).
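
To make the int-versus-float point concrete, here is a minimal sketch
that promotes the input to float before counting and dividing, so the
same code gives a float answer for int and float arrays alike. The
helper name nanmean_axis is made up for illustration, and the sketch
assumes a reasonably current NumPy.

```python
import numpy as np

def nanmean_axis(c, axis=0):
    """Mean along `axis`, ignoring NaNs (hypothetical helper, not a NumPy API)."""
    c = np.asarray(c, dtype=np.float64)      # int input is promoted up front
    valid = ~np.isnan(c)                     # True where an element should count
    total = np.where(valid, c, 0.0).sum(axis=axis)
    count = valid.sum(axis=axis)             # number of non-NaN entries
    return total / count                     # float / int gives a float result

ints = np.array([[1, 2, 3], [4, 5, 6]])
floats = np.array([[1.0, np.nan, 3.0], [4.0, 5.0, np.nan]])

print(nanmean_axis(ints))    # column means 2.5, 3.5, 4.5
print(nanmean_axis(floats))  # column means 2.5, 5.0, 3.0
```

A column made up entirely of NaNs would have a count of zero and
yield nan with a warning; the sketch does not guard against that.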

The correct answer depends on what you want the dtype to be. For example,
with floating point division:
np.mean(c/0.0,axis=0)

gives the expected floating point answer:
array([ Inf,  Inf,  Inf])

With integer division:
np.mean(c/0,axis=0)

gives the expected integer answer:
array([ 0.,  0.,  0.])
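
For anyone rerunning these two cases today, a sketch under the
assumption of Python 3 with a current NumPy: there the / operator is
always true division, so the integer case has to be written with //
to keep the int dtype.

```python
import numpy as np

c = np.array([[1, 2, 3], [4, 5, 6]])

# Silence the divide-by-zero RuntimeWarning for this demonstration only.
with np.errstate(divide="ignore"):
    float_div = c / 0.0    # floating point division: every element is inf
    floor_div = c // 0     # integer floor division: NumPy returns 0

print(np.mean(float_div, axis=0))   # all inf
print(floor_div.dtype.kind)         # 'i', the int dtype is kept
print(np.mean(floor_div, axis=0))   # all 0.0, since mean promotes ints to float64
```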

Note that the default action of mean is to convert ints to float64, which
is why the output is a float instead of an int. However, the numpy.mean
dtype argument does not appear to work for int dtypes.
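
A quick way to check the dtype behaviour on a given installation is
sketched below. It only prints what comes back; the result of the
int request may differ between NumPy versions, so no particular
output is assumed for it.

```python
import numpy as np

c = np.array([[1, 2, 3], [4, 5, 6]])

# Default: int input is accumulated and returned as float64.
default_mean = np.mean(c, axis=0)

# Explicit int dtype request; inspect the result rather than assume it.
int_mean = np.mean(c, axis=0, dtype=np.int64)

print(default_mean, default_mean.dtype)
print(int_mean, int_mean.dtype)
```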


Bruce