[Numpy-discussion] the mean, var, std of empty arrays

josef.pktd@gmai... josef.pktd@gmai...
Wed Nov 21 21:58:47 CST 2012


On Wed, Nov 21, 2012 at 10:35 PM, Charles R Harris
<charlesr.harris@gmail.com> wrote:
>
>
> On Wed, Nov 21, 2012 at 7:45 PM, <josef.pktd@gmail.com> wrote:
>>
>> On Wed, Nov 21, 2012 at 9:22 PM, Olivier Delalleau <shish@keba.be> wrote:
>> > Current behavior looks sensible to me. I personally would prefer no
>> > warning, but I think it makes sense to have one as it can be helpful
>> > to detect issues faster.
>>
>> I agree that nan should be the correct answer.
>> (I gave up trying to define a default for 0/0 in scipy.stats ttests.)
>>
>> Some funnier cases:
>>
>> >>> np.var([1], ddof=1)
>> 0.0
>
>
> This one is a nan in development.
>
>>
>> >>> np.var([1], ddof=5)
>> -0
>> >>> np.var([1,2], ddof=5)
>> -0.16666666666666666
>> >>> np.std([1,2], ddof=5)
>> nan
>>
>
> These still do this. Also
>
> In [10]: var([], ddof=1)
> Out[10]: -0
>
> Which suggests that the nan is pretty much an accidental byproduct of
> division by zero. I think it might make sense to have a definite policy for
> these corner cases.

It would also be consistent with the usual pattern to raise a
ValueError here: ddof too large, size too small.
As long as we don't allow for missing values, every row or column has
the same number of observations, so it can't happen that some columns
or rows give valid answers while others don't.
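
A minimal sketch of what that check could look like, assuming a
hypothetical wrapper named checked_var (not part of numpy) that raises
whenever size - ddof <= 0 along the reduced axis:

```python
import numpy as np

def checked_var(a, ddof=0, axis=None):
    """Hypothetical wrapper, not a numpy function.

    Raises ValueError instead of returning nan or a negative
    "variance" when there are too few observations for the given ddof.
    """
    a = np.asarray(a, dtype=float)
    # Number of observations that enter each variance estimate.
    n = a.size if axis is None else a.shape[axis]
    if n - ddof <= 0:
        raise ValueError(
            "ddof=%d too large for sample size %d" % (ddof, n))
    return np.var(a, ddof=ddof, axis=axis)
```

Without missing values n is the same for every row or column, so one
check up front covers the whole result.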


Quick check with np.ma:

It looks correct, except when it delegates to numpy?

>>> s = np.ma.var(np.ma.masked_invalid([[1.,2],[1,np.nan]]), ddof=5, axis=0)
>>> s
masked_array(data = [-- --],
             mask = [ True  True],
       fill_value = 1e+20)

>>> s = np.ma.var(np.ma.masked_invalid([[1.,2],[1,np.nan]]), ddof=1, axis=0)
>>> s
masked_array(data = [0.0 --],
             mask = [False  True],
       fill_value = 1e+20)
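
The masks in the two results above match what you get from counting
the unmasked values per column and comparing with ddof. A small sketch
of that bookkeeping (enough_data is a hypothetical helper name, and
this is not a claim about how np.ma implements it internally):

```python
import numpy as np

def enough_data(m, ddof, axis=0):
    """Hypothetical helper: True where count - ddof > 0 along axis."""
    counts = m.count(axis=axis)  # number of unmasked entries per slice
    return counts - ddof > 0

m = np.ma.masked_invalid([[1., 2.], [1., np.nan]])
counts = m.count(axis=0)          # column 0 has 2 values, column 1 has 1
ok_ddof1 = enough_data(m, ddof=1)  # only column 0 has enough data
ok_ddof5 = enough_data(m, ddof=5)  # no column has enough data
```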

>>> s = np.ma.std([1,2], ddof=5)
>>> s
masked
>>> type(s)
<class 'numpy.ma.core.MaskedConstant'>

>>> np.ma.var([1,2], ddof=5)
-0.16666666666666666
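
For reference, the negative value is just the sum of squared
deviations divided by n - ddof, and the nan from std is the square
root of that negative number. A sketch of the arithmetic done by hand
(plain array operations, not a call into np.var, so it does not depend
on how a given numpy version treats ddof > n):

```python
import numpy as np

x = np.array([1.0, 2.0])
ddof = 5
n = x.size

# Sum of squared deviations from the mean: 0.25 + 0.25 = 0.5
ss = ((x - x.mean()) ** 2).sum()

# Naive division by n - ddof = -3 gives a negative "variance".
naive_var = ss / (n - ddof)

# The square root of a negative float is nan; silence the warning here.
with np.errstate(invalid="ignore"):
    naive_std = np.sqrt(naive_var)
```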


Josef

>
> <snip>
>
> Chuck
>
>

