[Numpy-discussion] def of var of complex

Charles R Harris charlesr.harris@gmail....
Tue Jan 8 21:20:43 CST 2008

On Jan 8, 2008 7:48 PM, Robert Kern <robert.kern@gmail.com> wrote:

> Charles R Harris wrote:
>
> > Suppose you have a set of z_i and want to choose z to minimize the
> > average square error $\sum_i |z_i - z|^2$. The solution is that
> > $z=\mean{z_i}$ and the resulting average error is given by 2). Note that
> > I didn't mention Gaussians anywhere. No distribution is needed to
> > justify the argument, just the idea of minimizing the squared distance.
> > Leaving out the ^2 would yield another metric, or one could ask for a
> > minimax solution. It is a question of the distance function, not
> > probability. Anyway, that is one justification for the approach in 2)
> > and it is one that makes a lot of applied math simple. Whether or not a
> > least squares fit is useful is a different question.
>
> If you're not doing probability, then what are you using var() for? I can
> accept that the quantity is meaningful for your problem, but I'm not
> convinced it's a variance.
>

Lots of fits don't involve probability distributions. For instance, one
might want to fit a polynomial to a mathematical curve. This sort of
distinction between probability and distance goes back to Gauss himself,
although not in his original work on least squares.  Whether or not variance
implies probability is a semantic question. I think if we are going to
compute a single number, 2) is as good as anything even if it doesn't
capture the shape of the scatter plot. A 2D covariance wouldn't necessarily
capture the shape either.
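
To make the distance argument concrete, here is a minimal numerical sketch. It assumes that 2) means the real quantity mean(|z_i - mean(z)|^2); the sample data, the helper name mean_sq_dist, and the list of candidate centers are made up for illustration. It checks that the mean minimizes the average squared distance, and that the minimum equals the sum of the variances of the real and imaginary parts. Nothing in it relies on a probability distribution beyond generating some test points.

```python
import numpy as np

# Made-up sample of complex points (illustrative data only).
rng = np.random.default_rng(0)
z = rng.normal(size=50) + 1j * rng.normal(size=50)


def mean_sq_dist(points, center):
    """Average squared distance from the points to a candidate center."""
    return np.mean(np.abs(points - center) ** 2)


z_bar = z.mean()
err_at_mean = mean_sq_dist(z, z_bar)

# For any center c:
#   mean|z_i - c|^2 = mean|z_i - z_bar|^2 + |z_bar - c|^2
# so moving away from the mean can only increase the error.
candidates = [z_bar + 0.1, z_bar - 0.2j, 0.0 + 0.0j]
errors = [mean_sq_dist(z, c) for c in candidates]

# The minimum splits into the variances of the real and imaginary parts.
split = z.real.var() + z.imag.var()

print(err_at_mean, split, np.var(z))
```

The three printed numbers should agree to rounding error. The last one relies on the documented behaviour of np.var in current NumPy releases, which takes the absolute value before squaring for complex input.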

Chuck