[NumPy-Tickets] [NumPy] #2031: numpy.cov behaviour change when called with 2 arrays (cross-covariance)

NumPy Trac numpy-tickets@scipy....
Thu Jan 26 07:12:04 CST 2012

#2031: numpy.cov behaviour change when called with 2 arrays (cross-covariance)
 Reporter:  pierreh      |       Owner:  somebody   
     Type:  enhancement  |      Status:  new        
 Priority:  normal       |   Milestone:  Unscheduled
Component:  numpy.lib    |     Version:  1.6.1      
 Keywords:               |  
 Following up a discussion on Numpy-discussion ML, I propose a behaviour
 change in numpy.cov function, when called with 2 arrays. cov should only
 compute the *cross*-covariance of the two arrays

 Today's behaviour description:
 cov(X,Y) is just an equivalent of cov(concatenate(X,Y)).
 The output can be written in terms of blocks :
 1/N * (X*X.T X*Y.T)
       (Y*X.T Y*Y.T)

 Therefore, 3 blocks out of 4 are often useless, since diagonal blocks are
 simply cov(X) and cov(Y) while the off-diagonal blocks are transpose of
 each other : only one them is needed.

 This brings two issues :
  * today's behavior is somehow useless since cov(concatenate(X,Y)) is
  * it consumes 3 times as many resources as what is needed.

 Proposed behaviour :
 the output of cov(X,Y) should only be one of the cross-covariance block:
 1/N * (X*Y.T)

 This would put numpy in line with R's behaviour

 Also, cov(X,X) would become equivalent to cov(X)

 Implementation skeleton:

 def cov(X, Y=None):
     if Y is None:
         Y = X
           assert Y.shape == X.shape # or something like that
     # [...jumping to the end of the existing code...]
     if not rowvar:
         return (dot(X.T, Y.conj()) / fact).squeeze()
         return (dot(X, Y.T.conj()) / fact).squeeze()

Ticket URL: <http://projects.scipy.org/numpy/ticket/2031>
NumPy <http://projects.scipy.org/numpy>
My example project

More information about the NumPy-Tickets mailing list