[Numpy-discussion] strange behavior of numpy.random.multivariate_normal, ticket:1842
Thu Feb 16 10:47:51 CST 2012
On Thu, Feb 16, 2012 at 11:30 AM, <email@example.com> wrote:
> On Thu, Feb 16, 2012 at 11:20 AM, Warren Weckesser
> <firstname.lastname@example.org> wrote:
>> On Thu, Feb 16, 2012 at 10:12 AM, Pierre Haessig <email@example.com>
>>> Le 16/02/2012 16:20, firstname.lastname@example.org a écrit :
>>>> I don't see any way to fix multivariate_normal for this case, except
>>>> for dropping svd or for random perturbing a covariance matrix with
>>>> multiplicity of singular values.
>>> I just made a quick search in what R guys are doing. It happens there are
>>> several codes (http://cran.r-project.org/web/views/Multivariate.html ). For
>>> instance, mvtnorm
>>> (http://cran.r-project.org/web/packages/mvtnorm/index.html). I've attached
>>> the related function from the source code of this package.
>>> Interestingly enough, it seems they provide 3 different methods (svd,
>>> eigen values, and Cholesky).
>>> I don't have the time now to dive in the assessments of pros and cons of
>>> those three. Maybe one works for our problem, but I didn't check yet.
>> For some alternatives to numpy's multivariate_normal, see
>> http://www.scipy.org/Cookbook/CorrelatedRandomSamples. Both versions
>> (Cholesky and eigh) are just a couple lines of code.
> Thanks both,
> The main point is that it is a "Needs decision"
> Robert argued several times on the mailing list why he chose svd.
> (with svd covariance can be closer to singular then with cholesky)
> In statsmodels we usually just use Cholesky for similar
> transformation, and I use occasionally an eigh version. (I need to
> look up the thread but I got puzzled about results with eig and
> multiplicity of eigenvalues before.)
> The R code is GPL, but the few lines of code look standard without any
> special provision for non-deterministic linear algebra.
> If multivariate_normal switches from svd to cholesky or eigh, we still
> need to check that we don't run into similar "determinacy" problems
> with numpy's linalg (I think in statsmodels we use mostly scipy, so I
> don't know.)
np.linalg.eigh always produces the same eigenvectors, both running
repeatedly in the same session and running the script several times on
the command line.
so eigh looks good as alternative to svd for this case, I don't know
if we buy numerical problems in other corner cases, but for near
singularity it's always possible to check the smallest eigenvalue
>> NumPy-Discussion mailing list
More information about the NumPy-Discussion