[NumPy-Tickets] [NumPy] #1223: numpy.random.multivariate_normal accepts indefinite covariance matrices (was: need errors for non-physical numpy.random.multivariate_normal covariance matrices)
NumPy Trac
numpy-tickets@scipy....
Tue Jun 29 17:33:59 CDT 2010
#1223: numpy.random.multivariate_normal accepts indefinite covariance matrices
-------------------------------------------------------+--------------------
Reporter: zero79 | Owner: somebody
Type: defect | Status: new
Priority: normal | Milestone:
Component: numpy.random | Version:
Keywords: multivariate normal covariance indefinite |
-------------------------------------------------------+--------------------
Changes (by dgoldsmith):
* cc: d_l_goldsmith@… (added)
* keywords: => multivariate normal covariance indefinite
Comment:
Let me be more succinct: the problem is not that the code doesn't run, the
problem is that the code *does* run! Here's a much, much simpler example:
{{{
>>> import numpy as N
>>> N.version.version
'1.4.1'
>>> from numpy import random as R
>>> R.multivariate_normal((1,1), ((1,0),(0,-1))) # this should *not* run
array([ 0.67651334, 0.44764747])
}}}
This shouldn't run because the covariance matrix is not "non-negative
definite" (aka, positive semidefinite): it has a negative eigenvalue, -1,
which amounts to a negative variance for the second component.
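For anyone who wants to confirm this, here is a minimal sketch of an eigenvalue test on the matrix from the example. It assumes the matrix is symmetric (which is what `eigvalsh` expects); the variable names are only illustrative.

```python
import numpy as np

# Covariance matrix from the example above.
cov = np.array([[1.0, 0.0],
                [0.0, -1.0]])

# eigvalsh handles symmetric (Hermitian) matrices and returns the
# eigenvalues in ascending order.
eigenvalues = np.linalg.eigvalsh(cov)

# A symmetric matrix is positive semidefinite only if every eigenvalue is >= 0.
is_psd = bool(np.all(eigenvalues >= 0))

print(eigenvalues)
print(is_psd)
```

With floating-point input that is only nearly PSD, a small negative tolerance in place of the exact `>= 0` comparison is probably the safer choice.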
I discovered this bug independently while working on this function's
docstring which states at the end of the Notes (i.e., almost as an
afterthought):
"Note that the covariance matrix must be non-negative definite."
I thought it odd that such a requirement should be relegated to the Notes
and not be included in the description of the cov parameter, so I checked
the code and found no "enforcement." I then checked the behavior and found
that, sure enough, the function *is* all too "happy" to accept input which
the Notes state it "must" not be fed.
Now, I can see one reason why we may want to leave it up to the user to
enforce positive semidefiniteness (PSDness): speed. My first call of this
function w/ a two-component integer mean and a very simple integer-valued
cov resulted in a brief but noticeable delay in being provided the answer.
(Is this due to the use of svd, which a note in the code says should be
replaced by Cholesky decomp.?) But IMHO that's a dangerous road to go
down, and if we do go down it, then the warning sign needs to be featured
much more prominently, i.e., in the Extended summary and again in the cov
Parameter description.
However, a better alternative, IMO, is to add code that checks for PSDness
and raises a warning if the check fails - that should be the default
behavior, but we can add a keyword parameter which allows the user to
bypass that check if they want to.
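To make the proposal concrete, here is a rough sketch written as a wrapper, not as a patch to the function itself. The names `checked_multivariate_normal`, `check_psd` and `tol` are hypothetical, and the eigenvalue test is only one possible way to do the check.

```python
import warnings

import numpy as np


def checked_multivariate_normal(mean, cov, size=None, check_psd=True, tol=1e-8):
    """Hypothetical wrapper: warn about a bad cov, then draw samples as usual."""
    cov = np.asarray(cov, dtype=float)
    if check_psd:
        if not np.allclose(cov, cov.T):
            warnings.warn("cov is not symmetric", RuntimeWarning)
        else:
            # Smallest eigenvalue of the symmetric matrix; a value below
            # -tol means the matrix is not positive semidefinite.
            smallest = np.linalg.eigvalsh(cov).min()
            if smallest < -tol:
                warnings.warn(
                    "cov is not positive semidefinite "
                    "(smallest eigenvalue: %g)" % smallest,
                    RuntimeWarning,
                )
    # The check only warns; sampling proceeds either way.
    return np.random.multivariate_normal(mean, cov, size)


# Warns, then returns a sample anyway.
sample = checked_multivariate_normal((1, 1), ((1, 0), (0, -1)))

# Skips the wrapper's check for users who want the speed.
fast_sample = checked_multivariate_normal((1, 1), ((1, 0), (0, 1)), check_psd=False)
```

The check costs one extra symmetric eigenvalue decomposition per call, which is why the bypass keyword matters for callers who have already validated their cov.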
--
Ticket URL: <http://projects.scipy.org/numpy/ticket/1223#comment:3>
NumPy <http://projects.scipy.org/numpy>