[Numpy-discussion] warning or error for non-physical multivariate_normal covariance matrices?
Charles R Harris
charlesr.harris@gmail....
Tue Sep 15 22:11:25 CDT 2009
On Tue, Sep 15, 2009 at 12:26 PM, Robert Kern <robert.kern@gmail.com> wrote:
> On Tue, Sep 15, 2009 at 12:50, Charles R
> Harris<charlesr.harris@gmail.com> wrote:
> >
> >
> > On Tue, Sep 15, 2009 at 11:38 AM, Michael Gilbert
> > <michael.s.gilbert@gmail.com> wrote:
> >>
> >> hi,
> >>
> >> when using numpy.random.multivariate_normal, would it make sense to warn
> >> the user that they have entered a non-physical covariance matrix? i was
> >> recently working on a problem and getting very strange results until i
> >> finally realized that i had actually entered a bogus covariance matrix.
> >>
> >> its easy to determine when this is the case -- its when the
> >> determinant of the covariance matrix is negative. i.e. the
> >> multivariate normal distribution has det(C)^1/2 as part of the
> >> normalization factor, so when det(C)<0, you end up with an imaginary
> >> probability distribution.
> >>
> >
> > Hmm, you mean it isn't implemented using a cholesky decomposition? That
> > would (should) throw an error if the covariance isn't symmetric positive
> > definite.
>
> We use the SVD to do the matrix square root. I believe I was just
> following the older code that I was replacing. I have run into nearly
> degenerate cases where det(C) ~ 0 such that the SVD method gave not
> unreasonable answers, given the circumstances, while the Cholesky
> decomposition gave an error "too soon" in my estimation.
>
>
That's a bit dangerous for ill conditioned covariance matrices because the
orthogonal matrices aren't guaranteed to be transposes of each other. In
particular, if there are negative eigenvalues, due to roundoff error or
incorrect user input, the square root is guaranteed to fail because the
singular values have to be positive and this failure will pass unnoticed.
It's possible that where cholesky failed the svd method was actually giving
you wrong results. In any case eigh would be a safer choice if you don't
want to use cholesky.
Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.scipy.org/pipermail/numpy-discussion/attachments/20090915/9eb20d8a/attachment.html
More information about the NumPy-Discussion
mailing list