[Numpy-discussion] var bias reason?

Paul Barrett pebarrett@gmail....
Wed Oct 15 10:31:44 CDT 2008


I'm behind Travis on this one.

 -- Paul

On Wed, Oct 15, 2008 at 11:19 AM, David Cournapeau <cournape@gmail.com> wrote:
> On Wed, Oct 15, 2008 at 11:45 PM, Travis E. Oliphant
> <oliphant@enthought.com> wrote:
>> Gabriel Gellner wrote:
>>> Some colleagues noticed that var uses the biased formula by default in numpy;
>>> searching for the reason only brought up:
>>>
>>> http://article.gmane.org/gmane.comp.python.numeric.general/12438/match=var+bias
>>>
>>> which I totally agree with, but there was no response. Any reason for this?
>> I will try to respond to this as it was me who made the change.  I think
>> there have been responses, but I think I've preferred to stay quiet
>> rather than feed a flame war.   Ultimately, it is a matter of preference
>> and I don't think everybody would give equal weight to all the
>> arguments surrounding the decision.
>>
>> I will attempt to articulate my reasons:  dividing by n gives the maximum
>> likelihood estimator of variance and I prefer that justification more
>> than the "un-biased" justification for a default (especially given that
>> bias is just one part of the "error" in an estimator).    Having every
>> package that computes the mean return the "un-biased" estimate gives it
>> more cultural weight than the concept deserves, I think.  Any
>> surprise that is created by the different default should be mitigated by
>> the fact that it's an opportunity to learn something about what you are
>> doing.    Here is a paper I wrote on the subject that you might find
>> useful:
>>
>> https://contentdm.lib.byu.edu/cdm4/item_viewer.php?CISOROOT=/EER&CISOPTR=134&CISOBOX=1&REC=1
>> (Hopefully, they will resolve a link problem at the above site soon, but
>> you can read the abstract).
>
> Yes, I hope so too; I would be happy to read the article.
>
> On the limits of unbiasedness, the following document mentions an
> example (in a different context than variance estimation):
>
> http://www.stat.columbia.edu/~gelman/research/published/badbayesresponsemain.pdf
>
> AFAIK, even statisticians who consider themselves "mostly
> frequentist" (if that makes any sense) do not advocate unbiasedness as
> such an important concept anymore (Larry Wasserman mentions it in his
> "All of Statistics").
>
> cheers,
>
> David
>
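
For readers following the thread, the two conventions being argued about
differ only in the divisor, which NumPy exposes through the ddof argument
of var. The sketch below is an illustration, not anything posted in the
thread: the helper name compare_variance_estimators, the sample size, the
trial count, and the choice of normally distributed data are all
assumptions made for the example. It checks by simulation that dividing
by n underestimates the true variance on average, while also having the
smaller mean squared error in this setting, which is one way to read the
remark that bias is only one part of an estimator's error.

```python
import numpy as np


def compare_variance_estimators(n=5, trials=100_000, true_var=1.0, seed=0):
    """Simulate both variance estimators on normal samples (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    # One row per trial, n normally distributed observations per row.
    samples = rng.normal(0.0, np.sqrt(true_var), size=(trials, n))

    biased = samples.var(axis=1)            # ddof=0: divide by n (NumPy default)
    unbiased = samples.var(axis=1, ddof=1)  # ddof=1: divide by n - 1

    return {
        "mean_biased": biased.mean(),
        "mean_unbiased": unbiased.mean(),
        "mse_biased": np.mean((biased - true_var) ** 2),
        "mse_unbiased": np.mean((unbiased - true_var) ** 2),
    }


# The same four numbers under each convention.
x = np.array([1.0, 2.0, 3.0, 4.0])
var_n = np.var(x)                  # sum of squared deviations / n
var_n_minus_1 = np.var(x, ddof=1)  # sum of squared deviations / (n - 1)

result = compare_variance_estimators()
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

Passing ddof=1 is all that is needed to get the divide-by-(n - 1) estimate
when that is what an analysis calls for; the default only decides which of
the two you get when you say nothing.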

