[SciPy-User] scipy.stats.nanstd, bias and ddof

josef.pktd@gmai... josef.pktd@gmai...
Fri Jan 15 12:48:11 CST 2010


On Fri, Jan 15, 2010 at 1:07 PM, Keith Goodman <kwgoodman@gmail.com> wrote:
> By default np.std and scipy.std normalize by N. But scipy.stats.nanstd
> normalizes by N-1.
>
> >>> x = np.random.rand(4)
> >>> np.std(x)
> 0.12006913635950889
> >>> scipy.std(x)
> 0.12006913635950889
> >>> scipy.stats.nanstd(x)
> 0.13864389639705668
> >>> scipy.stats.nanstd(x, bias=True)
> 0.12006913635950889
>
> Can the default for nanstd be changed to bias=True? Or would that break code?
>
> Even better I guess would be to replace the bias keyword with ddof as
> used in np.std and scipy.std. So
>
>    if bias:
>        m2c = m2 / n
>    else:
>        m2c = m2 / (n - 1.)
>
> in scipy.stats.nanstd would become
>
>     m2c = m2 / (n - ddof)
>
> For me it doesn't matter if the default ddof is 0 or 1. But it is nice
> when all std functions use the same default.

I agree that the arguments should be consistent across functions. But
changing the default degrees of freedom will affect user code, so we
would have to go through a warning period, and maybe add the ddof
argument in the meantime. (Having both bias and ddof as arguments would
be a bit messy, though.)
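
To make the transition idea concrete, here is a rough sketch of what
accepting both keywords during a warning period could look like. The
function name nanstd_transition, the None sentinels, and the exact
warning text are all made up for illustration; this is not the
scipy.stats.nanstd implementation. It only handles 1-D input and does
not guard against n - ddof being zero.

```python
import warnings

import numpy as np


def nanstd_transition(x, bias=None, ddof=None):
    """Hypothetical 1-D NaN-aware std that accepts bias or ddof (not both)."""
    if bias is not None and ddof is not None:
        raise TypeError("pass either bias or ddof, not both")
    if bias is not None:
        # Old-style keyword: warn, then translate it to the new one.
        warnings.warn(
            "bias is deprecated; use ddof (bias=True -> ddof=0, "
            "bias=False -> ddof=1)",
            DeprecationWarning,
            stacklevel=2,
        )
        ddof = 0 if bias else 1
    elif ddof is None:
        # Assumption: keep the old default (normalize by N-1) for now.
        ddof = 1

    x = np.asarray(x, dtype=float)
    valid = x[~np.isnan(x)]  # drop NaNs before computing anything
    n = valid.size
    m2 = np.sum((valid - valid.mean()) ** 2)  # sum of squared deviations
    return np.sqrt(m2 / (n - ddof))
```

With this, nanstd_transition([1.0, np.nan, 3.0], ddof=0) returns 1.0,
and passing bias=True gives the same value but emits a
DeprecationWarning. Using None as the default for both keywords is what
lets the function tell "not passed" apart from an explicit value.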

Or maybe numpy should get nanmean, nanvar, and nanstd, similar to
nansum? Then it would be easier to deprecate the scipy.stats versions,
like the other stats functions that moved to numpy.

Josef
