[SciPy-user] Inconsistent standard deviation and variance implementation in scipy vs. scipy.stats

Johann Rohwer jr@sun.ac...
Thu Sep 25 06:39:10 CDT 2008


On Thursday, 25 September 2008, Pauli Virtanen wrote:
> The opposite direction would be completely removing `var` from
> scipy.stats. Is there a reason why the function is reimplemented in
> scipy? There's probably need eg. for float -> complex casting
> sqrt(), but I don't clearly see why there are two variants of
> `var`.
>
> Personally, I'd prefer not to have the same function reimplemented
> in two places, unless there is a clear need for it. I think there
> are more examples of duplication / signature mismatches in scipy
> vs. numpy that could be cleaned up a bit, at least in scipy.linalg.

I agree that duplicate implementations of the same function are 
confusing. 

However, within numpy itself there is further inconsistency, in that 
np.var and np.std use the "ddof" kwarg, whereas np.cov uses 
the "bias" kwarg (as do sp.stats.std and sp.stats.var). Also, the default 
normalisation in np.cov is by N-1 (unbiased), whereas in np.std and 
np.var the default is by N (biased).
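
A minimal sketch of the mismatch, using only np.var and np.cov. The 
sample array x is made up for illustration, and the exact set of keyword 
arguments may differ between numpy versions, so treat this as a check to 
run against your own install rather than a statement of the API:

```python
import numpy as np

# Hypothetical sample data, chosen only for illustration.
x = np.array([1.0, 2.0, 3.0, 4.0])

# np.var normalises by N unless ddof is given.
var_default = np.var(x)
# ddof=1 switches np.var to normalisation by N-1.
var_ddof1 = np.var(x, ddof=1)

# np.cov on a 1-D input returns the variance, normalised by N-1 by default.
cov_default = float(np.cov(x))
# bias=True switches np.cov to normalisation by N.
cov_bias = float(np.cov(x, bias=True))

print(var_default, cov_bias)   # both normalised by N
print(var_ddof1, cov_default)  # both normalised by N-1
```

So the two functions only agree once one of them is moved off its 
default, and the keyword needed to do that is different in each case.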

Johann

