# [NumPy-Tickets] [NumPy] #2031: numpy.cov behaviour change when called with 2 arrays (cross-covariance)

NumPy Trac numpy-tickets@scipy....
Thu Jan 26 07:12:04 CST 2012

```#2031: numpy.cov behaviour change when called with 2 arrays (cross-covariance)
-------------------------+--------------------------------------------------
Reporter:  pierreh      |       Owner:  somebody
Type:  enhancement  |      Status:  new
Priority:  normal       |   Milestone:  Unscheduled
Component:  numpy.lib    |     Version:  1.6.1
Keywords:               |
-------------------------+--------------------------------------------------
Following up a discussion on Numpy-discussion ML, I propose a behaviour
change in numpy.cov function, when called with 2 arrays. cov should only
compute the *cross*-covariance of the two arrays

Today's behaviour description:
cov(X,Y) is just an equivalent of cov(concatenate(X,Y)).
The output can be written in terms of blocks :
1/N * (X*X.T X*Y.T)
(Y*X.T Y*Y.T)

Therefore, 3 blocks out of 4 are often useless, since diagonal blocks are
simply cov(X) and cov(Y) while the off-diagonal blocks are transpose of
each other : only one them is needed.

This brings two issues :
* today's behavior is somehow useless since cov(concatenate(X,Y)) is
equivalent
* it consumes 3 times as many resources as what is needed.

Proposed behaviour :
the output of cov(X,Y) should only be one of the cross-covariance block:
1/N * (X*Y.T)

This would put numpy in line with R's behaviour
(http://stat.ethz.ch/R-manual/R-patched/library/stats/html/cor.html)

Also, cov(X,X) would become equivalent to cov(X)

Implementation skeleton:

def cov(X, Y=None):
if Y is None:
Y = X
else:
assert Y.shape == X.shape # or something like that
# [...jumping to the end of the existing code...]
if not rowvar:
return (dot(X.T, Y.conj()) / fact).squeeze()
else:
return (dot(X, Y.T.conj()) / fact).squeeze()

--
Ticket URL: <http://projects.scipy.org/numpy/ticket/2031>
NumPy <http://projects.scipy.org/numpy>
My example project
```