[Numpy-discussion] Why NaN?
Pierre GM
pgmdevlist@gmail....
Wed Aug 5 14:11:46 CDT 2009
<cough> And, er... masked arrays anyone ? </cough>
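For what it's worth, here is a minimal sketch of what the masked-array route could look like. The array `c` below is made-up data for illustration, not the array from the thread. `np.ma.masked_invalid` masks NaN and inf entries, and the usual reductions then skip them along any axis.

```python
import numpy as np

# Made-up example data (an assumption): a float array with one NaN and one inf.
c = np.array([[1.0, np.nan, 3.0],
              [4.0, 5.0, np.inf]])

# masked_invalid masks every NaN and +/-inf entry.
mc = np.ma.masked_invalid(c)

# Reductions ignore masked entries, and the axis argument works as usual.
col_means = mc.mean(axis=0)   # mean of each column over its valid entries
row_means = mc.mean(axis=1)   # mean of each row over its valid entries

print(col_means)
print(row_means)
```

Unlike `c[isfinite(c)].mean()`, this keeps the array's shape, so it is not limited to the 1-D or flattened case.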
On Aug 5, 2009, at 11:20 AM, Bruce Southey wrote:
> On 08/05/2009 09:18 AM, Keith Goodman wrote:
>>
>> On Wed, Aug 5, 2009 at 1:40 AM, Bruce Southey<bsouthey@gmail.com>
>> wrote:
>>
>>> On Tue, Aug 4, 2009 at 4:05 PM, Keith Goodman<kwgoodman@gmail.com>
>>> wrote:
>>>
>>>> On Tue, Aug 4, 2009 at 1:53 PM, Bruce Southey<bsouthey@gmail.com>
>>>> wrote:
>>>>
>>>>> On Tue, Aug 4, 2009 at 1:40 PM, Gökhan
>>>>> Sever<gokhansever@gmail.com> wrote:
>>>>>
>>>>>> This is the loveliest of all solutions:
>>>>>>
>>>>>> c[isfinite(c)].mean()
>>>>>>
>>>>> This handling of nonfinite elements has come up before.
>>>>> Please remember that this only works for 1-D or flattened arrays,
>>>>> so it will not work in general, especially along an axis.
>>>>>
>>>> If you don't want to use nanmean from scipy.stats you could use:
>>>>
>>>> np.nansum(c, axis=0) / (~np.isnan(c)).sum(axis=0)
>>>>
>>>> or
>>>>
>>>> np.nansum(c, axis=0) / (c == c).sum(axis=0)
>>>>
>>>> But if c contains ints then you'll run into trouble with the
>>>> division,
>>>> so you'll need to protect against that.
>>>>
>>> That is not a problem because nan and infinity are only defined for
>>> floating point numbers, not integers. So any array that has
>>> nonfinite elements like nans and infinity must have a floating
>>> point dtype.
>>>
>>
>> That is true. But I was thinking of this case (no nans or infs):
>>
>>
>> >>> c
>> array([[1, 2, 3],
>>        [4, 5, 6]])
>>
>> >>> c.mean(0)
>> array([ 2.5, 3.5, 4.5]) <--- good
>>
>> >>> np.nansum(c, axis=0) / (c == c).sum(axis=0)
>> array([2, 3, 4]) <--- bad
>>
>> >>> np.nansum(c, axis=0) / (c == c).sum(axis=0, dtype=np.float)
>> array([ 2.5, 3.5, 4.5]) <--- good
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion@scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>
> Sure but that is about ints versus floats and not about nans or
> infs. Your 'good' examples are really about first converting an int
> array into a float array and your 'bad' example maintains int dtype
> (same result if you cast the arrays from 'good' approaches back to
> an int dtype).
>
> The correct answer depends on what you want the dtype to be. For
> example,
> With floating point division:
> np.mean(c/0.0,axis=0)
>
> gives the expected floating point answer:
> array([ Inf, Inf, Inf])
>
> With integer division:
> np.mean(c/0,axis=0)
>
> gives the expected integer answer:
> array([ 0., 0., 0.])
>
> Note the default action of mean is to convert ints to float64, which
> is why the output is a float instead of an int. However, the
> numpy.mean dtype argument does not appear to work for int dtypes.
>
>
> Bruce
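
To make the int-versus-float point concrete, here is a rough sketch of a NaN-aware mean along an axis that forces a float dtype before dividing. The helper name `nanmean_axis` and the sample arrays are hypothetical, for illustration only; this is not an existing NumPy function.

```python
import numpy as np

def nanmean_axis(a, axis=0):
    """Mean along `axis`, ignoring NaNs (hypothetical helper, not a NumPy API)."""
    a = np.asarray(a, dtype=float)       # force float so the division is never integer division
    valid = ~np.isnan(a)                 # True where an entry counts toward the mean
    total = np.where(valid, a, 0.0).sum(axis=axis)
    count = valid.sum(axis=axis)
    # A slice with no valid entries gives 0/0, which is reported as NaN.
    with np.errstate(invalid="ignore", divide="ignore"):
        return total / count

# Made-up sample data (an assumption, chosen to mirror the int case above).
ints = np.array([[1, 2, 3], [4, 5, 6]])
floats = np.array([[1.0, np.nan, 3.0], [4.0, 5.0, np.nan]])

print(nanmean_axis(ints, axis=0))
print(nanmean_axis(floats, axis=0))
```

Casting up front means an int array gets the same answer as `c.mean(0)`, without a separate `dtype` argument on the count.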