[Numpy-discussion] Why NaN?

Keith Goodman kwgoodman@gmail....
Wed Aug 5 09:18:17 CDT 2009


On Wed, Aug 5, 2009 at 1:40 AM, Bruce Southey<bsouthey@gmail.com> wrote:
> On Tue, Aug 4, 2009 at 4:05 PM, Keith Goodman<kwgoodman@gmail.com> wrote:
>> On Tue, Aug 4, 2009 at 1:53 PM, Bruce Southey<bsouthey@gmail.com> wrote:
>>> On Tue, Aug 4, 2009 at 1:40 PM, Gökhan Sever<gokhansever@gmail.com> wrote:
>>>> This is the loveliest of all solutions:
>>>>
>>>> c[isfinite(c)].mean()
>>>
>>> This handling of nonfinite elements has come up before.
>>> Please remember that this only works for a 1d or flattened array, so it
>>> will not work in general, especially along an axis.
>>
>> If you don't want to use nanmean from scipy.stats you could use:
>>
>> np.nansum(c, axis=0) / (~np.isnan(c)).sum(axis=0)
>>
>> or
>>
>> np.nansum(c, axis=0) / (c == c).sum(axis=0)
>>
>> But if c contains ints then you'll run into trouble with the division,
>> so you'll need to protect against that.
>
> That is not a problem because nan and infinity are only defined for
> floating point numbers, not integers. So any array that has nonfinite
> elements like nans and infinity must have a floating point dtype.

That is true. But I was thinking of this case (no nans or infs):

>> c
array([[1, 2, 3],
       [4, 5, 6]])
>> c.mean(0)
   array([ 2.5,  3.5,  4.5])  <--- good
>> np.nansum(c, axis=0) / (c == c).sum(axis=0)
   array([2, 3, 4])  <--- bad
>> np.nansum(c, axis=0) / (c == c).sum(axis=0, dtype=float)
   array([ 2.5,  3.5,  4.5])  <--- good
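
The session above can be wrapped into a small helper. This is only a sketch: nanmean_axis is a hypothetical name, not a NumPy or SciPy function, and it assumes the input has no column (or row) made up entirely of NaNs, since that case gives 0/0 and a NaN result with a warning. Counting with a float dtype is what protects against integer division when c holds ints.

```python
import numpy as np

def nanmean_axis(a, axis=0):
    """Mean along `axis`, ignoring NaNs (hypothetical helper, not a NumPy API)."""
    a = np.asarray(a)
    # x == x is False only for NaN, so this counts the non-NaN entries.
    # Accumulating the count as float forces true division for int input.
    count = (a == a).sum(axis=axis, dtype=float)
    return np.nansum(a, axis=axis) / count

c_int = np.array([[1, 2, 3], [4, 5, 6]])                   # integer dtype, no NaNs
c_nan = np.array([[1.0, np.nan, 3.0], [4.0, 5.0, 6.0]])    # float dtype, one NaN

print(nanmean_axis(c_int))          # same values as c_int.mean(0)
print(nanmean_axis(c_nan))          # the NaN is skipped in the middle column
print(nanmean_axis(c_nan, axis=1))  # works along the other axis too
```

Newer NumPy releases also provide np.nanmean(c, axis=0), which covers the same case without a helper.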

