[Numpy-discussion] np.nan and ``is``
Fri Sep 19 12:52:17 CDT 2008
Alan G Isaac wrote:
> Might someone explain this to me?
> >>> x = [1.,np.nan]
> >>> np.nan in x
> >>> np.nan in np.array(x)
> >>> np.nan in np.array(x).tolist()
> >>> np.nan is float(np.nan)
Not quite -- but I do know that "is" is tricky -- it tests object
identity. In CPython it actually compares the pointers to the two objects.
What makes this tricky is that Python interns some objects, so that when
you create two that have the same value, they may actually be the same object:
>>> s1 = "this"
>>> s2 = "this"
>>> s1 is s2
True
So short strings are interned, as are small integers and maybe floats?
However, longer strings are not:
>>> s1 = "A much longer string"
>>> s2 = "A much longer string"
>>> s1 is s2
False
I don't know the interning rules, but I do know that you should never
count on them: they may not be consistent between implementations, or
even between different runs.
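A minimal sketch of the difference between "==" and "is", assuming Python 3. The strings are built at run time so that the compiler cannot merge them into a single constant, and the result of the plain identity test is printed but deliberately not relied upon. The variable names are for illustration only.

```python
import sys

# Two equal strings built at run time, so the compiler cannot merge them
# into one shared constant.
s1 = " ".join(["A", "much", "longer", "string"])
s2 = " ".join(["A", "much", "longer", "string"])

# Equality compares values and is always reliable.
same_value = (s1 == s2)

# Identity compares objects. Whether s1 and s2 happen to be one object is
# an implementation detail, so this result is shown but not relied upon.
same_object = (s1 is s2)
print("equal:", same_value, "identical:", same_object)

# sys.intern() returns the canonical copy of a string, so the identity
# test only becomes meaningful once both strings are interned explicitly.
interned_same = (sys.intern(s1) is sys.intern(s2))
print("identical after sys.intern:", interned_same)
```

The point is that "==" answers the question you almost always mean to ask; "is" only tells you whether two names happen to refer to one object.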
NaN is a floating point number with a specific value. np.nan is a
particular instance of that, but not all NaNs will be the same instance:
>>> np.array(0.0) / 0
nan
>>> np.array(0.0) / 0 is np.nan
False
So you can't use "is" to check.
>>> np.array(0.0) / 0 == np.nan
False
and you can't use "==", because a NaN never compares equal to anything,
not even to itself.
The only way to do it reliably is:
>>> np.isnan(np.array(0.0) / 0)
True
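The three tests can be put side by side, together with the containment checks from the original question. This is a sketch assuming CPython and a reasonably recent NumPy; the variable names are illustrative. A list's "in" tries identity before equality, while an array's "in" compares element values with "==", which is why the two disagree.

```python
import numpy as np

# 0.0 / 0 is an invalid operation; silence the RuntimeWarning for the demo.
with np.errstate(invalid="ignore", divide="ignore"):
    result = np.array(0.0) / 0

is_same_object = result is np.nan    # identity: a different object
is_equal = bool(result == np.nan)    # equality: NaN is unequal to itself
is_nan = bool(np.isnan(result))      # the reliable test

x = [1.0, np.nan]
in_list = np.nan in x                # list membership tries "is" before "=="
in_array = np.nan in np.array(x)     # array membership uses "==" only
# tolist() builds new float objects, so neither "is" nor "==" matches.
in_roundtrip = np.nan in np.array(x).tolist()

print(is_same_object, is_equal, is_nan)
print(in_list, in_array, in_roundtrip)
```

Only the list case finds the NaN, and only because the very same np.nan object was put into the list.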
So, the short answer is that the only way to deal with NaNs properly is
to have NaN-aware functions, like nanmin() and friends.
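For illustration, here is how the ordinary and the NaN-aware reductions differ on a small array. This is a sketch assuming a reasonably recent NumPy, where np.min propagates the NaN; the sample data is made up.

```python
import numpy as np

data = np.array([3.0, np.nan, 1.0])

plain_min = np.min(data)      # the NaN propagates through the ordinary reduction
aware_min = np.nanmin(data)   # NaN entries are ignored
aware_max = np.nanmax(data)   # NaN entries are ignored
aware_sum = np.nansum(data)   # NaN entries are treated as zero

print(plain_min, aware_min, aware_max, aware_sum)
```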
Regardless of how many nan* functions get written, or what exactly they
do, we really do need to make sure that no numpy function gives bogus
results in the presence of NaNs, which doesn't appear to be the case now.
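One way to rule out a silently bogus result is to check for NaNs before reducing. The sketch below uses a hypothetical helper, checked_min, which is not a NumPy function and is only meant to show the idea.

```python
import numpy as np

def checked_min(values):
    """Hypothetical helper (not part of NumPy): refuse input containing NaN."""
    arr = np.asarray(values, dtype=float)
    if np.isnan(arr).any():
        # Fail loudly instead of returning a value that may be meaningless.
        raise ValueError("input contains NaN")
    return arr.min()

clean_result = checked_min([3.0, 1.0, 2.0])

try:
    checked_min([3.0, np.nan, 1.0])
    raised = False
except ValueError:
    raised = True

print(clean_result, raised)
```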
I also think I see a consensus building that non-nan-specific numpy
functions should either preserve NaNs or raise exceptions, rather than
Christopher Barker, Ph.D.
Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception