[Numpy-discussion] Ticket #1223...
Thu Jul 1 09:40:25 CDT 2010
On 06/29/2010 11:38 PM, David Goldsmith wrote:
> On Tue, Jun 29, 2010 at 8:16 PM, Bruce Southey <email@example.com
> <mailto:firstname.lastname@example.org>> wrote:
> On Tue, Jun 29, 2010 at 6:03 PM, David Goldsmith
> <email@example.com <mailto:firstname.lastname@example.org>> wrote:
> > On Tue, Jun 29, 2010 at 3:56 PM, <email@example.com
> <mailto:firstname.lastname@example.org>> wrote:
> >> On Tue, Jun 29, 2010 at 6:37 PM, David Goldsmith
> >> <email@example.com <mailto:firstname.lastname@example.org>> wrote:
> >> > ...concerns the behavior of numpy.random.multivariate_normal; if that's
> >> > of interest to you, I urge you to take a look at the comments (esp. mine
> >> > :-) ); otherwise, please ignore the noise. Thanks!
> >> You should add the link to the ticket, so it's faster for everyone to
> >> check what you are talking about.
> >> Josef
> > Ooops! Yes I should; here it is:
> > http://projects.scipy.org/numpy/ticket/1223
> > Sorry, and thanks, Josef.
> > DG
> > _______________________________________________
> > NumPy-Discussion mailing list
> > NumPy-Discussion@scipy.org <mailto:NumPy-Discussion@scipy.org>
> > http://mail.scipy.org/mailman/listinfo/numpy-discussion
> As I recall, there is no requirement for the variance/covariance of
> the normal distribution to be positive definite.
> No, not positive definite, positive *semi*-definite: yes, the variance
> may be zero (the cov may have zero-valued eigenvalues), but the claim
> (and I actually am "neutral" about it, in that I wanted to reference
> the claim in the docstring and was told that doing so was unnecessary,
> the implication being that this is a "well-known" fact), is that, in
> essence (in 1-D) the variance can't be negative, which seems clear
> enough. I don't see you disputing that, and so I'm uncertain as to
> how you feel about the proposal to "weakly" enforce symmetry and
> positive *semi*-definiteness. (Now, if you dispute that even
> requiring positive *semi*-definiteness is desirable, you'll have to
> debate that w/ some of the others, because I'm taking their word for
> it that indefiniteness is "unphysical.")
> From http://en.wikipedia.org/wiki/Multivariate_normal_distribution:
> "The covariance matrix is allowed to be singular (in which case the
> corresponding distribution has no density)."
> So you must be able to draw random numbers from such a distribution.
> Obviously what those numbers really mean is another matter (I presume
> the dependent variables should be a linear function of the independent
> variables) but the user *must* know since they entered it. Since the
> function works the docstring Notes comment must be wrong.
> Imposing any restriction means that this is no longer a multivariate
> normal random number generator. If anything, you can only raise a
> warning about possible non-positive definiteness but even that will
> vary depending how it is measured and on the precision being used.
> Mathematician: noun, someone who disavows certainty when their
> uncertainty set is non-empty, even if that set has measure zero.
> Hope: noun, that delusive spirit which escaped Pandora's jar and, with
> her lies, prevents mankind from committing a general suicide. (As
> interpreted by Robert Graves)
As you (and the theory) say, a variance should not be negative - yeah
right :-) In practice that is not exactly true because estimation
procedures like equating observed with expected sum of squares do lead
to negative estimates. However, that is really a failure of the model,
data and algorithm.
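
To make the negative-estimate point concrete, here is a minimal sketch. It
assumes a balanced one-way random-effects layout, which is my choice of
illustration; the function name and the data are made up. The between-group
variance is estimated by equating observed and expected mean squares, and it
comes out negative when the group means vary less than the within-group
noise predicts.

```python
import numpy as np

def moment_variance_components(groups):
    """Method-of-moments estimates for a balanced one-way layout.

    groups: array of shape (k, n), k groups with n observations each.
    Returns (between_group_variance, within_group_variance).
    Hypothetical helper, for illustration only.
    """
    groups = np.asarray(groups, dtype=float)
    k, n = groups.shape
    group_means = groups.mean(axis=1)
    grand_mean = groups.mean()
    # Observed mean squares between and within groups.
    msb = n * np.sum((group_means - grand_mean) ** 2) / (k - 1)
    msw = np.sum((groups - group_means[:, None]) ** 2) / (k * (n - 1))
    # E[MSB] = within + n * between and E[MSW] = within, so solve for between.
    return (msb - msw) / n, msw

# Made-up data: every group has the same mean, so MSB is zero.
data = [[1.0, 5.0], [2.0, 4.0], [3.0, 3.0]]
between, within = moment_variance_components(data)
print(between, within)
```

Nothing stops the subtraction from going below zero, which is why such
estimates are usually truncated or treated as a sign of a poor model.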
I think the issue is really how numpy should handle input when that
input is theoretically invalid.
I (and apparently the bug submitter) do not know what to expect if the
input is not positive definite. If the svd approach is correct for such
cases and numpy 'trusts' the user, as is usually the case, then there is
no issue. If the svd approach is incorrect for such cases then that is
obviously a bug.
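
A small sketch of what an SVD-based factor implies may help here. It assumes
the sampler forms x = mean + z @ L with L = sqrt(s)[:, None] * vh taken from
np.linalg.svd(cov); this is an illustration and not a copy of numpy's source,
and svd_factor and implied_cov are hypothetical helper names. Because
L.T @ L equals the matrix square root of cov.T @ cov, it reproduces a
symmetric positive semi-definite cov (singular or not), but for an indefinite
cov it quietly replaces negative eigenvalues by their absolute values.

```python
import numpy as np

def svd_factor(cov):
    """Return L such that z @ L has covariance L.T @ L for standard normal z."""
    u, s, vh = np.linalg.svd(cov)
    return np.sqrt(s)[:, None] * vh

def implied_cov(cov):
    """Covariance that samples built from svd_factor(cov) actually have."""
    L = svd_factor(cov)
    return L.T @ L

singular_psd = np.array([[1.0, 1.0], [1.0, 1.0]])  # eigenvalues 0 and 2
indefinite = np.array([[1.0, 2.0], [2.0, 1.0]])    # eigenvalues -1 and 3

print(implied_cov(singular_psd))  # compare with singular_psd
print(implied_cov(indefinite))    # compare with indefinite

# Draws from the singular case: the two coordinates are perfectly dependent.
rng = np.random.default_rng(0)
samples = rng.standard_normal((5, 2)) @ svd_factor(singular_psd)
print(samples)
```

So under this assumption the singular case is handled sensibly, while an
indefinite input produces samples from a different, valid covariance without
any complaint.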
If numpy can not trust the user then numpy has to check the input and
either raise a warning or an error unless the variances are greater than
or equal to zero and the cov argument is symmetric. Replacing the SVD
with cholesky would also address these issues, as both of these are
checked by numpy's cholesky function. However, cholesky() does not
support positive semi-definite covariance/variance input (which is
possible).
Also, as Robert said in the thread, 'Cholesky decomposition gave an
error "too soon" in my estimation'.
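
The trade-off can be sketched as follows. This is only an illustration of a
"weak" check, not a proposed patch: check_cov, cholesky_accepts and the
tolerance value are hypothetical. It compares a tolerance-based symmetry and
eigenvalue test, which warns, with np.linalg.cholesky, which raises
LinAlgError as soon as the matrix is not strictly positive definite.

```python
import warnings
import numpy as np

def check_cov(cov, tol=1e-8):
    """Warn unless cov is symmetric and positive semi-definite within tol.

    Returns (is_symmetric, is_psd). The tolerance is an arbitrary assumption.
    """
    cov = np.asarray(cov, dtype=float)
    is_symmetric = bool(np.allclose(cov, cov.T, atol=tol))
    # eigvalsh expects a symmetric matrix, so test the symmetrized input.
    eigvals = np.linalg.eigvalsh((cov + cov.T) / 2.0)
    is_psd = bool(eigvals.min() >= -tol)
    if not (is_symmetric and is_psd):
        warnings.warn("cov is not symmetric positive semi-definite",
                      RuntimeWarning)
    return is_symmetric, is_psd

def cholesky_accepts(cov):
    """True if np.linalg.cholesky factors cov without raising."""
    try:
        np.linalg.cholesky(cov)
    except np.linalg.LinAlgError:
        return False
    return True

zero_variance = np.array([[1.0, 0.0], [0.0, 0.0]])  # valid, but singular
indefinite = np.array([[1.0, 2.0], [2.0, 1.0]])     # eigenvalues -1 and 3

for name, cov in [("zero_variance", zero_variance), ("indefinite", indefinite)]:
    print(name, "cholesky accepts:", cholesky_accepts(cov))
```

The zero-variance matrix is a legitimate covariance, yet cholesky rejects it
just as it rejects the indefinite one, which is the "too soon" behaviour.
Also note that NumPy's documentation says cholesky does not itself verify
that the input is symmetric, so a separate symmetry test like the one in
check_cov may still be needed.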