[Numpy-discussion] nan_to_num and bool arrays

Keith Goodman kwgoodman@gmail....
Fri Dec 11 18:38:24 CST 2009


On Fri, Dec 11, 2009 at 4:06 PM, Robert Kern <robert.kern@gmail.com> wrote:
> On Fri, Dec 11, 2009 at 17:44, Keith Goodman <kwgoodman@gmail.com> wrote:
>> On Fri, Dec 11, 2009 at 2:22 PM, Robert Kern <robert.kern@gmail.com> wrote:
>>> On Fri, Dec 11, 2009 at 16:09, Keith Goodman <kwgoodman@gmail.com> wrote:
>>>> On Fri, Dec 11, 2009 at 1:14 PM, Robert Kern <robert.kern@gmail.com> wrote:
>>>>> On Fri, Dec 11, 2009 at 14:41, Keith Goodman <kwgoodman@gmail.com> wrote:
>>>>>> On Fri, Dec 11, 2009 at 12:08 PM, Bruce Southey <bsouthey@gmail.com> wrote:
>>>>>
>>>>>>> So I agree that it should leave the input untouched when a non-float
>>>>>>> dtype is used for some array-like input.
>>>>>>
>>>>>> Would only one line need to be changed? Would changing
>>>>>>
>>>>>> if not issubclass(t, _nx.integer):
>>>>>>
>>>>>> to
>>>>>>
>>>>>> if not issubclass(t, _nx.integer) and not issubclass(t, _nx.bool_):
>>>>>>
>>>>>> do the trick?
>>>>>
>>>>> That still leaves strings, voids, and objects. I recommend:
>>>>>
>>>>>  if issubclass(t, _nx.inexact):
>>>>>
>>>>> Arguably, one should handle nan float objects in object arrays and
>>>>> float columns in structured arrays, but the current code does not
>>>>> handle either of those anyways.
>>>>
>>>> Without your change both
>>>>
>>>>>> np.nan_to_num(np.array([True, False]))
>>>>>> np.nan_to_num([1])
>>>>
>>>> raise exceptions. With your change:
>>>>
>>>>>> np.nan_to_num(np.array([True, False]))
>>>>   array([ True, False], dtype=bool)
>>>>>> np.nan_to_num([1])
>>>>   array([1])
>>>
>>> I think this is correct, though the latter one happens by accident.
>>> Lists don't have a .dtype attribute so obj2sctype(type([1])) is
>>> checked and happens to be object_. The latter line is intended to
>>> handle scalars, not sequences. I think that sequences should be
>>> coerced to arrays for output and this check should be more explicit
>>> about what it handles. [1.0] will have a problem if you don't.
>>
>> That makes sense. But I'm not smart enough to implement it.
>
> Something like the following at the top should help distinguish the
> various cases:
>
> is_scalar = False
> if not isinstance(x, _nx.ndarray):
>    x = np.asarray(x)
>    if x.shape == ():
>        # Must return this as a scalar later.
>        is_scalar = True
> old_shape = x.shape
> if x.shape == ():
>    # We need element access.
>    x.shape = (1,)
> t = x.dtype.type
>
> This should allow one to pass in [np.inf] and have it correctly get
> interpreted as a float array rather than an object scalar.
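
As a quick check of the dtype test suggested above, the sketch below shows
which inputs pass an issubclass(..., inexact) check once they have been
coerced with asarray. It assumes that _nx.inexact is the same class as the
public np.inexact; the helper name is_inexact is made up for illustration
and is not part of numpy.

```python
import numpy as np

def is_inexact(a):
    """Return True if the array form of `a` has a float or complex dtype."""
    # Hypothetical helper: coerce first, so lists are judged by their
    # element dtype rather than by type(list).
    return issubclass(np.asarray(a).dtype.type, np.inexact)

checks = {
    "bool": is_inexact(np.array([True, False])),
    "int": is_inexact([1]),
    "float": is_inexact([np.inf]),
    "complex": is_inexact([1j]),
    "str": is_inexact(["a"]),
    "object": is_inexact(np.array([None], dtype=object)),
}
print(checks)
```

Only the float and complex entries should come back True, which is why
bool, integer, string and object inputs can be returned untouched.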

That seems to work, but it reshapes a 0-d input array in place:

>> x = np.array(1)
>> x.shape
   ()
>> y = nan_to_num(x)
>> x.shape
   (1,)

To avoid changing the input, I moved y = x.copy() further up and switched
the x's to y's. Here's what it looks like:

def nan_to_num(x):
    is_scalar = False
    if not isinstance(x, _nx.ndarray):
        x = asarray(x)
        if x.shape == ():
            # Must return this as a scalar later.
            is_scalar = True
    # Work on a copy so the caller's array is never modified.
    y = x.copy()
    old_shape = y.shape
    if y.shape == ():
        # We need element access.
        y.shape = (1,)
    t = y.dtype.type
    if issubclass(t, _nx.complexfloating):
        # Clean the real and imaginary parts separately, then fall through
        # so the scalar/shape handling below still applies.
        y = nan_to_num(y.real) + 1j * nan_to_num(y.imag)
    elif issubclass(t, _nx.inexact):
        are_inf = isposinf(y)
        are_neg_inf = isneginf(y)
        are_nan = isnan(y)
        maxf, minf = _getmaxmin(y.dtype.type)
        y[are_nan] = 0
        y[are_inf] = maxf
        y[are_neg_inf] = minf
    if is_scalar:
        y = y[0]
    else:
        y.shape = old_shape
    return y
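
The function above depends on names that only exist inside numpy's own
source (_nx, _getmaxmin). For trying the idea outside the source tree, here
is a rough self-contained sketch that uses only the public API. The name
nan_to_num_sketch, the use of np.finfo in place of _getmaxmin, and the use
of np.atleast_1d for element access are all assumptions made for
illustration, not what numpy ships.

```python
import numpy as np

def nan_to_num_sketch(x):
    """Replace nan with 0 and +/-inf with the largest finite values."""
    a = np.asarray(x)
    # Non-array 0-d input (e.g. a Python float) is returned as a scalar.
    is_scalar = not isinstance(x, np.ndarray) and a.ndim == 0
    # atleast_1d gives element access for 0-d input; copy protects the caller.
    y = np.atleast_1d(a).copy()
    if np.issubdtype(y.dtype, np.complexfloating):
        # Clean real and imaginary parts separately, then recombine.
        y = nan_to_num_sketch(y.real) + 1j * nan_to_num_sketch(y.imag)
    elif np.issubdtype(y.dtype, np.floating):
        info = np.finfo(y.dtype)
        y[np.isnan(y)] = 0
        y[np.isposinf(y)] = info.max
        y[np.isneginf(y)] = info.min
    # Other dtypes (bool, int, str, object) pass through unchanged.
    if is_scalar:
        return y[0]
    return y.reshape(a.shape)

x = np.array(1)
nan_to_num_sketch(x)
print(x.shape)                                  # input shape is left alone
print(nan_to_num_sketch(np.array([True, False])))
print(nan_to_num_sketch([np.inf, -np.inf, np.nan]))
```

The checks worth running against any version of this are the ones from the
thread: a bool array and [1] come back unchanged, [np.inf] is treated as a
float array, and a 0-d input array keeps its shape.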

