[Numpy-discussion] var bias reason?
Wed Oct 15 10:31:44 CDT 2008
I'm behind Travis on this one.
On Wed, Oct 15, 2008 at 11:19 AM, David Cournapeau <email@example.com> wrote:
> On Wed, Oct 15, 2008 at 11:45 PM, Travis E. Oliphant
> <firstname.lastname@example.org> wrote:
>> Gabriel Gellner wrote:
>>> Some colleagues noticed that var uses biased formulas by default in numpy,
>>> searching for the reason only brought up:
>>> which I totally agree with, but there was no response? Any reason for this?
>> I will try to respond to this as it was me who made the change. I think
>> there have been responses, but I think I've preferred to stay quiet
>> rather than feed a flame war. Ultimately, it is a matter of preference
>> and I don't think there would be equal weights given to all the
>> arguments surrounding the decision by everybody.
>> I will attempt to articulate my reasons: dividing by n is the maximum
>> likelihood estimator of variance and I prefer that justification more
>> than the "un-biased" justification for a default (especially given that
>> bias is just one part of the "error" in an estimator). Having every
>> package that computes the mean return the "un-biased" estimate gives it
>> more cultural weight than the concept deserves, I think. Any
>> surprise that is created by the different default should be mitigated by
>> the fact that it's an opportunity to learn something about what you are
>> doing. Here is a paper I wrote on the subject that you might find
>> (Hopefully, they will resolve a link problem at the above site soon, but
>> you can read the abstract).
> Yes, I hope so too; I would be happy to read the article.
> On the limits of unbiasedness, the following document mentions an
> example (in a different context than variance estimation):
> AFAIK, even statisticians who consider themselves "mostly
> frequentist" (if that makes any sense) do not advocate unbiasedness as
> such an important concept anymore (Larry Wasserman mentions it in his
> "All of Statistics").
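
For anyone who wants to see the two conventions side by side, here is a
minimal sketch. The sample values, the seed, the sample size of 10 and the
trial count are my own assumptions, chosen only for illustration; the one
NumPy feature relied on is the ddof argument of var. The second half is a
small simulation on normally distributed data, meant to illustrate Travis's
point that bias is only one part of an estimator's error.

```python
import numpy as np

# Illustrative data (hypothetical values, not taken from the thread).
x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
n = x.size

var_n = np.var(x)            # default ddof=0: divide by n (ML estimate for normal data)
var_n1 = np.var(x, ddof=1)   # ddof=1: divide by n - 1 (the "unbiased" estimate)

print(var_n)                 # 4.0
print(var_n1)                # equals var_n * n / (n - 1)

# Small simulation: draw many samples of size 10 from N(0, 1) and compare
# the bias and the mean squared error (MSE) of the two estimators.
rng = np.random.default_rng(0)   # fixed seed, an arbitrary choice
true_var = 1.0
samples = rng.normal(0.0, np.sqrt(true_var), size=(20000, 10))

est_n = samples.var(axis=1)            # divide by n
est_n1 = samples.var(axis=1, ddof=1)   # divide by n - 1

bias_n = est_n.mean() - true_var
bias_n1 = est_n1.mean() - true_var
mse_n = np.mean((est_n - true_var) ** 2)
mse_n1 = np.mean((est_n1 - true_var) ** 2)

print("bias:", bias_n, bias_n1)
print("MSE: ", mse_n, mse_n1)
```

Under these assumptions the divide-by-n estimate comes out biased low while
still having the smaller mean squared error, which is the trade-off being
argued about above. For non-normal data the comparison can look different.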