[Numpy-discussion] missing data discussion round 2
Thu Jun 30 14:27:24 CDT 2011
On 06/30/2011 08:53 AM, Nathaniel Smith wrote:
> On Wed, Jun 29, 2011 at 2:21 PM, Eric Firing <email@example.com> wrote:
>> In addition, for new code, the full-blown masked array module may not be
>> needed. A convenience it adds, however, is the automatic masking of
>> invalid values:
>> In : np.ma.log(-1)
>> Out: masked
>> I'm sure this horrifies some, but there are times and places where it is
>> a genuine convenience, and preferable to having to use a separate
>> operation to replace nan or inf with NA or whatever it ends up being.
> Err, but what would this even get you? NA, NaN, and Inf basically all
> behave the same WRT floating point operations anyway, i.e., they all
Not exactly. First, it depends on np.seterr; second, calculations on NaN
can be very slow, so are better avoided entirely; third, if an array is
passed to extension code, it is much nicer if that code only has one NA
value to handle, instead of having to check for all possible "bad" values.
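To illustrate the first point, here is a minimal sketch (the array `x` is just an example I made up) showing that the plain ufunc result depends on the error state, while np.ma.log masks the out-of-domain entries either way:

```python
import numpy as np

x = np.array([-1.0, 0.0, 1.0, np.e])

# With floating-point errors ignored, the plain ufunc returns nan and -inf.
with np.errstate(invalid="ignore", divide="ignore"):
    plain = np.log(x)

# With errors set to "raise", the same call raises instead of returning values.
try:
    with np.errstate(invalid="raise", divide="raise"):
        np.log(x)
    raised = False
except FloatingPointError:
    raised = True

# The masked-array version masks the entries outside the domain of log,
# so downstream code only has one kind of "bad" value to deal with.
masked = np.ma.log(x)
bad = np.ma.getmaskarray(masked)
```

Here `bad` is a plain boolean array marking the invalid entries, which is all that extension code would need to check.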
> Is the idea that if ufuncs gain a skipna=True flag, you'd also like
> to be able to turn it into a skipna_and_nan_and_inf=True flag?
No, it is to have a situation where skipna_and_nan_and_inf would not be
needed, because an operation generating a nan or inf would turn those
values into NA or IGNORE or whatever right away.
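As a rough sketch of what I mean, using today's masked arrays as a stand-in for NA: the helper `masked_op` below is hypothetical (it is not part of numpy), and it simply masks any non-finite result as soon as it is produced, using np.ma.masked_invalid.

```python
import numpy as np

def masked_op(func, *args):
    """Hypothetical helper: apply func, then mask any nan or inf it produced."""
    with np.errstate(all="ignore"):
        result = func(*args)
    return np.ma.masked_invalid(result)

a = np.array([1.0, 0.0, -4.0, 9.0])

r = masked_op(np.sqrt, a)         # sqrt(-4) gives nan, which is masked
q = masked_op(np.divide, 1.0, a)  # 1/0 gives inf, which is masked

# Later reductions skip the masked entries without any extra flag.
total = r.sum()
```

With this, nothing downstream has to know whether the bad value started life as a nan or an inf.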
> -- Nathaniel