[NumPy-Tickets] [NumPy] #1223: numpy.random.multivariate_normal accepts indefinite covariance matrices (was: need errors for non-physical numpy.random.multivariate_normal covariance matrices)

NumPy Trac numpy-tickets@scipy....
Tue Jun 29 17:33:59 CDT 2010


#1223: numpy.random.multivariate_normal accepts indefinite covariance matrices
-------------------------------------------------------+--------------------
 Reporter:  zero79                                     |       Owner:  somebody
     Type:  defect                                     |      Status:  new     
 Priority:  normal                                     |   Milestone:          
Component:  numpy.random                               |     Version:          
 Keywords:  multivariate normal covariance indefinite  |  
-------------------------------------------------------+--------------------
Changes (by dgoldsmith):

 * cc: d_l_goldsmith@… (added)
 * keywords:  => multivariate normal covariance indefinite


Comment:

 Let me be more succinct: the problem is not that the code doesn't run, the
 problem is that the code *does* run!  Here's a much, much simpler example:
 {{{
 >>> import numpy as N
 >>> N.version.version
 '1.4.1'
 >>> from numpy import random as R
 >>> R.multivariate_normal((1,1), ((1,0),(0,-1))) # this should *not* run
 array([ 0.67651334,  0.44764747])
 }}}
 This shouldn't run because the covariance matrix is not "non-negative
 definite" (aka, positive semidefinite).
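
 As a quick illustration of why this particular matrix fails the
 requirement, here is a minimal sketch that inspects its eigenvalues. It
 assumes the matrix is symmetric (so numpy.linalg.eigvalsh applies); the
 variable names are mine and are not taken from the multivariate_normal
 source.

```python
import numpy as np

# The covariance matrix from the example above.
cov = np.array([[1.0, 0.0],
                [0.0, -1.0]])

# eigvalsh assumes a symmetric (Hermitian) matrix and returns its
# eigenvalues in ascending order.
eigenvalues = np.linalg.eigvalsh(cov)

# A symmetric matrix is positive semidefinite only if every eigenvalue
# is >= 0.  Here one eigenvalue is -1, so the test fails.
is_psd = bool(np.all(eigenvalues >= 0))
print(eigenvalues, is_psd)
```

 A negative eigenvalue would correspond to a negative variance along the
 matching eigenvector, which is why such a matrix cannot be a covariance.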

 I discovered this bug independently while working on this function's
 docstring, which states at the end of the Notes (i.e., almost as an
 afterthought):

 "Note that the covariance matrix must be non-negative definite."

 I thought it odd that such a requirement should be relegated to the Notes
 and not be included in the discussion of the cov function parameter, so I
 checked the code and found no "enforcement," so I checked the behavior,
 and found that, sure enough, the function *is* all too "happy" to accept
 input which the Notes state it "must" not be fed.

 Now, I can see a reason (speed) why we may want to leave it up to the
 user to enforce positive semidefiniteness (PSDness): my first call of this
 function w/ a two-component integer mean and a very simple integer-valued
 cov resulted in a brief but noticeable delay before the answer came back.
 (Is this due to the use of svd, which a note in the code says should be
 replaced by Cholesky decomp.?)  But IMHO that's a dangerous road to go
 down, and if we do go down it, then the warning sign needs to be featured
 much more prominently, i.e., in the Extended summary and again in the cov
 Parameter description.

 However, a better alternative, IMO, is to add code that checks for PSDness
 and raises a warning if the check fails - that should be the default
 behavior, but we can add a keyword parameter which allows the user to
 bypass that check if they want to.
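
 To make the proposal concrete, here is a rough sketch of what I have in
 mind, written as a wrapper rather than as a patch. Everything named here
 is hypothetical: warn_if_not_psd, checked_multivariate_normal, the
 check_psd keyword and the tol default are illustrations only, not
 existing NumPy API, and the eigenvalue test is just one possible way to
 do the check.

```python
import warnings

import numpy as np


def warn_if_not_psd(cov, tol=1e-8):
    """Hypothetical helper: warn and return False if cov is not PSD."""
    cov = np.asarray(cov, dtype=float)
    # A covariance matrix must be symmetric before PSDness makes sense.
    if not np.allclose(cov, cov.T):
        warnings.warn("covariance matrix is not symmetric", RuntimeWarning)
        return False
    # Allow eigenvalues slightly below zero to absorb round-off error.
    if np.linalg.eigvalsh(cov).min() < -tol:
        warnings.warn("covariance matrix is not positive semidefinite",
                      RuntimeWarning)
        return False
    return True


def checked_multivariate_normal(mean, cov, size=None, check_psd=True):
    """Hypothetical wrapper: check cov by default, then draw samples."""
    if check_psd:
        warn_if_not_psd(cov)
    return np.random.multivariate_normal(mean, cov, size)
```

 With this sketch, passing ((1,0),(0,-1)) as cov would trigger the
 warning by default, while check_psd=False skips the eigenvalue
 computation for users who have already validated their input.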

-- 
Ticket URL: <http://projects.scipy.org/numpy/ticket/1223#comment:3>
NumPy <http://projects.scipy.org/numpy>