[Numpy-discussion] weighted mean; weighted standard error of the mean (sem)
Fri Sep 10 12:58:08 CDT 2010
Interesting. Thanks Erin, Josef and Keith.
There is a nice article on this at
http://www.stata.com/support/faqs/stat/supweight.html. In my case, the
model I have in mind assumes that the expected value (mean) is the same
for each sample and that the weights are (or should be) normalised, whence a
consistent estimator for the sem is straightforward (provided second moments
can be assumed to be well behaved?). I suspect that this (survey-like) case
is also one of the two most common expressions that people want when they
ask for an s.e. of the mean for a weighted dataset. The other would be when
the weights are not to be
normalised, but represent standard errors on the individual
Surely what one wants, in the end, is a single function (or whatever)
called mean or sem which calculates different values for different
specified choices of model (assumptions)? And, where possible, one that has a
default model for when none is specified?
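
To make the "one function, several models" idea concrete, here is a minimal
sketch in plain Python. The name weighted_mean_sem, the model argument and its
two labels are placeholders of mine, not an existing NumPy or SciPy API. For
the "normalised" model it assumes independent samples with a common mean and a
common variance, and it estimates that variance with the ordinary unweighted
sample variance, which is only one of several defensible choices. For the
"errors" model the second argument is read as per-observation standard errors
and the usual inverse-variance formulas are applied.

```python
import math


def weighted_mean_sem(x, w, model="normalised"):
    """Return (weighted mean, standard error of that mean).

    Hypothetical helper, for illustration only.

    model="normalised": w are relative weights; every x[i] is assumed to
        share one mean and one variance sigma**2.
    model="errors": w are standard errors sigma_i of the individual x[i].
    """
    x = [float(v) for v in x]
    w = [float(v) for v in w]
    n = len(x)
    if n != len(w) or n < 2:
        raise ValueError("x and w must have the same length (at least 2)")

    if model == "normalised":
        total = sum(w)
        p = [wi / total for wi in w]           # weights now sum to 1
        mean = sum(pi * xi for pi, xi in zip(p, x))
        # Assumption: estimate the common sigma**2 with the unweighted,
        # ddof=1 sample variance (unbiased under this model).
        xbar = sum(x) / n
        s2 = sum((xi - xbar) ** 2 for xi in x) / (n - 1)
        # Var(sum p_i x_i) = sigma**2 * sum p_i**2 for independent x_i.
        sem = math.sqrt(s2 * sum(pi ** 2 for pi in p))
    elif model == "errors":
        inv_var = [1.0 / wi ** 2 for wi in w]  # inverse-variance weights
        total = sum(inv_var)
        mean = sum(vi * xi for vi, xi in zip(inv_var, x)) / total
        sem = math.sqrt(1.0 / total)
    else:
        raise ValueError("unknown model: %r" % (model,))
    return mean, sem


print(weighted_mean_sem([1, 2, 3, 4], [1, 1, 1, 1], model="errors"))  # (2.5, 0.5)
```

With equal weights the "normalised" branch reduces to the familiar s/sqrt(n),
which seems like a reasonable sanity check for whatever default is chosen.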
On Thu, Sep 9, 2010 at 9:13 PM, Keith Goodman <firstname.lastname@example.org> wrote:
> >>>> ma.std()
> >> 3.2548815339711115
> > or maybe `w` reflects an underlying sampling scheme and you should
> > sample in the bootstrap according to w ?
> > if weighted average is a sum of linear functions of (normal)
> > distributed random variables, it still depends on whether the
> > individual observations have the same or different variances, e.g.
> > http://en.wikipedia.org/wiki/Weighted_mean#Statistical_properties
> ...lots of possibilities. As you have shown the problem is not yet
> well defined. Not much specification needed for the weighted mean,
> lots needed for the standard error of the weighted mean.
> > What I can't figure out is whether, if you assume sigma_i = sigma for
> > all observations i, we use the weighted or the unweighted variance
> > to get an estimate of sigma. And I'm not able to replicate with simple
> > calculations what statsmodels.WLS gives me.
> My guess: if all you want is sigma of the individual i and you know
> sigma is the same for all i, then I suppose you don't care about the
> > ???
> > Josef
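
On the weighted versus unweighted variance question quoted above, a quick
simulation may help. This is only a sketch, assuming independent normal draws
with one common mean and one common sigma; the function name simulate, the
weights and all the numbers are made up for illustration. It compares the
empirical variance of the weighted mean with sigma**2 * sum(p_i**2), and
averages two candidate estimates of sigma**2 over many trials.

```python
import random


def simulate(p, mu=10.0, sigma=2.0, trials=20000, seed=0):
    """Monte Carlo check for normalised weights p (illustrative only).

    Returns (variance of the weighted mean across trials,
             average unweighted ddof=1 variance,
             average weighted variance sum p_i * (x_i - m)**2).
    """
    rng = random.Random(seed)
    n = len(p)
    means = []
    sum_unweighted = 0.0
    sum_weighted = 0.0
    for _ in range(trials):
        x = [rng.gauss(mu, sigma) for _ in range(n)]
        m = sum(pi * xi for pi, xi in zip(p, x))       # weighted mean
        xbar = sum(x) / n                              # unweighted mean
        sum_unweighted += sum((xi - xbar) ** 2 for xi in x) / (n - 1)
        sum_weighted += sum(pi * (xi - m) ** 2 for pi, xi in zip(p, x))
        means.append(m)
    grand = sum(means) / trials
    var_of_mean = sum((m - grand) ** 2 for m in means) / (trials - 1)
    return var_of_mean, sum_unweighted / trials, sum_weighted / trials


p = [0.4, 0.3, 0.15, 0.1, 0.05]                        # sums to 1
sum_p2 = sum(pi ** 2 for pi in p)
var_of_mean, s2_unweighted, s2_weighted = simulate(p)
print("var of weighted mean:", var_of_mean, "theory:", 4.0 * sum_p2)
print("unweighted variance :", s2_unweighted, "theory:", 4.0)
print("weighted variance   :", s2_weighted, "theory:", 4.0 * (1.0 - sum_p2))
```

Under these assumptions the unweighted variance should average close to
sigma**2, while the weighted one should come out low by a factor of
1 - sum(p_i**2). That says nothing about what statsmodels' WLS computes
internally, which I have not checked.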