[Numpy-discussion] Bug in numpy std, etc. with other data structures?

Robert Kern robert.kern@gmail....
Sat Sep 17 22:20:07 CDT 2011


On Sat, Sep 17, 2011 at 22:11, Bruce Southey <bsouthey@gmail.com> wrote:
> On Sat, Sep 17, 2011 at 10:00 PM, Wes McKinney <wesmckinn@gmail.com> wrote:
>> On Sat, Sep 17, 2011 at 10:50 PM, Bruce Southey <bsouthey@gmail.com> wrote:
>>> On Sat, Sep 17, 2011 at 4:12 PM, Wes McKinney <wesmckinn@gmail.com> wrote:
>>>> On Sat, Sep 17, 2011 at 4:48 PM, Skipper Seabold <jsseabold@gmail.com> wrote:
>>>>> Just ran into this. Any objections to having numpy.std and other
>>>>> functions in core/fromnumeric.py call asanyarray before trying to use
>>>>> the array's method? Other data structures like pandas and larry define
>>>>> their own std method, for instance, and this doesn't allow them to
>>>>> pass through. I'm inclined to say that the issue is with numpy, though
>>>>> maybe the data structures shouldn't shadow numpy array methods while
>>>>> altering the signature. I dunno.
>>>>>
>>>>> df = pandas.DataFrame(np.random.random((10,5)))
>>>>>
>>>>> np.std(df,axis=0)
>>>>> <snip>
>>>>> TypeError: std() got an unexpected keyword argument 'dtype'
>>>>>
>>>>> np.std(np.asanyarray(df),axis=0)
>>>>> array([ 0.30883352,  0.3133324 ,  0.26517361,  0.26389029,  0.20022444])
>>>>>
>>>>> Though I don't think this would work with larry yet.
>>>>>
>>>>> Pull request: https://github.com/numpy/numpy/pull/160
>>>>>
>>>>> Skipper
>>>
>>> numpy.std() does accept array-like input, which obviously means that
>>> np.std([1,2,3,5]) works, making the asanyarray call a total waste of CPU
>>> time. Clearly a pandas object is not array-like input (as Wes points out
>>> below), so an error is correct. Doing this type of 'fix' will have
>>> unintended consequences when other non-numpy objects are incorrectly
>>> passed to numpy functions. Rather, you should determine why 'array-like'
>>> failed here IF you think a pandas object is either array-like or a numpy
>>> object.
>>
>> No, it is failing because np.std takes the EAFP/duck-typing approach:
>>
>> try:
>>    std = a.std
>> except AttributeError:
>>    return _wrapit(a, 'std', axis, dtype, out, ddof)
>> return std(axis, dtype, out, ddof)
>>
>> Indeed DataFrame has an std method but it doesn't have the same
>> function signature as ndarray.std.
>
> Thanks for the clarification - see, Robert, I am not making things up!

I have no doubt that np.std() fails to work as desired. But the fault
is with np.std() not living up to the semantics implied by the
documentation (or the documentation documenting the wrong semantics),
not that DataFrame does not live up to a meaning of "array-like" that
you invented.
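
To make the failure mode concrete, here is a minimal pure-Python sketch of
the dispatch Wes quoted. It is not the numpy source: FrameLike, duck_std and
coerced_std are hypothetical names, the statistics module stands in for the
array computation, and the keyword-style call is an assumption chosen to
match the "unexpected keyword argument 'dtype'" error Skipper reported.

```python
import statistics


class FrameLike:
    """Hypothetical container whose std() signature differs from ndarray.std."""

    def __init__(self, values):
        self.values = list(values)

    def std(self, axis=0):
        # Accepts only `axis`; no dtype, out or ddof parameters.
        return statistics.pstdev(self.values)

    def __iter__(self):
        return iter(self.values)


def duck_std(a, axis=None, dtype=None, out=None, ddof=0):
    """Sketch of the EAFP dispatch: defer to a.std whenever it exists."""
    try:
        std = a.std
    except AttributeError:
        # No std attribute: fall back to computing on the plain values.
        return statistics.pstdev(list(a))
    # Assumes a.std takes the same arguments as ndarray.std.
    return std(axis=axis, dtype=dtype, out=out, ddof=ddof)


def coerced_std(a, axis=None, dtype=None, out=None, ddof=0):
    """Sketch of the proposed alternative: coerce first, never call a.std."""
    return statistics.pstdev(list(a))


frame = FrameLike([1, 2, 3, 5])

try:
    duck_std(frame, axis=0)
except TypeError as exc:
    print("duck_std failed:", exc)

print("coerced_std:", coerced_std(frame))
print("plain list via duck_std:", duck_std([1, 2, 3, 5]))
```

A plain list goes through the fallback branch and works, while the object
that happens to define its own std() fails only because its signature
differs. Coercing first, roughly the role asanyarray would play, sidesteps
the foreign method altogether.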

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

