[SciPy-user] Bias in numpy.random.multivariate_normal?
Sat Nov 10 12:55:47 CST 2007
Could someone confirm I've got this right?
In a multivariate normal (MVN) the covariance is E(X.X') - E(X).E(X').
I sample many times from a MVN with given mean and covariance. I then
calculate the covariance of the sample. This covariance matrix should
show no bias to be larger or smaller than the distribution's covariance
but I see that some entries are always larger and others are smaller.
See this code:
from numpy.random import multivariate_normal
from numpy.linalg import inv
from numpy import outer
mu = [10.,10.]
precision = [[.6, .3], [.4, .3]]
sigma = inv(precision)
sample_size = 10000
num_tests = 100
for test in xrange(num_tests):
samples = multivariate_normal(mu, sigma, [sample_size])
sample_mean = samples.sum(axis=0)/sample_size
#print (sample_mean - mu) > 0.
sample_covariance = sum(outer(sample-sample_mean, sample-sample_mean)
for sample in samples) / sample_size
print (sample_covariance - sigma) > 0
'True' is printed when the entry is larger and False otherwise
Did I get this wrong or is this acceptable behaviour for a random
variate generator or is it a bug?
More information about the SciPy-user