# [Numpy-discussion] Cross-covariance function

Bruce Southey bsouthey@gmail....
Fri Jan 27 10:28:53 CST 2012

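For readers following the thread below: the cross-covariance block under discussion can already be extracted from the full matrix returned by `np.cov(x, y)`. The following is a minimal sketch, assuming the default `rowvar=True` layout (rows are variables, columns are observations) and `ddof=1`. The names `cross_cov` and `cross_corrcoef` are hypothetical helpers for illustration only; they are not NumPy functions and not an implementation of the proposed `xcov`/`xcorrcoef`.

```python
import numpy as np


def cross_cov(x, y, ddof=1):
    """Hypothetical helper: covariance between each row of x and each row of y.

    Assumes rows are variables and columns are observations, as with
    np.cov's default rowvar=True.
    """
    x = np.atleast_2d(np.asarray(x, dtype=float))
    y = np.atleast_2d(np.asarray(y, dtype=float))
    n_obs = x.shape[1]
    # Center each variable on its own mean.
    xc = x - x.mean(axis=1, keepdims=True)
    yc = y - y.mean(axis=1, keepdims=True)
    return xc @ yc.T / (n_obs - ddof)


def cross_corrcoef(x, y):
    """Hypothetical helper: cross_cov normalized by per-variable standard deviations."""
    x = np.atleast_2d(np.asarray(x, dtype=float))
    y = np.atleast_2d(np.asarray(y, dtype=float))
    # The variances are not part of the cross block, so compute them separately.
    sx = x.std(axis=1, ddof=1)
    sy = y.std(axis=1, ddof=1)
    return cross_cov(x, y, ddof=1) / np.outer(sx, sy)


rng = np.random.default_rng(0)
x = rng.normal(size=(2, 50))  # 2 variables, 50 observations
y = rng.normal(size=(3, 50))  # 3 variables, 50 observations

# np.cov stacks y under x and returns the full 5x5 matrix;
# the upper-right 2x3 block is the cross-covariance part.
block_from_cov = np.cov(x, y)[:2, 2:]
block_direct = cross_cov(x, y)
```

Both routes should agree up to floating-point rounding. The direct version simply avoids forming the within-`x` and within-`y` blocks that `np.cov(x, y)` also computes.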
```
On 01/27/2012 09:00 AM, Benjamin Root wrote:
>
>
> On Friday, January 27, 2012, Pierre Haessig <pierre.haessig@crans.org> wrote:
> > On 26/01/2012 19:19, josef.pktd@gmail.com wrote:
> >> The discussion had this reversed, numpy matches the behavior of
> >> MATLAB, while R (statistics) only returns the cross covariance part as
> >> proposed.
> >>
> > I would also say that there was an attempt to match MATLAB behavior.
> > However, there is a big difference with numpy.cov because the default
> > value of `rowvar` is True. Most software and textbooks I know consider
> > that, in a 2D context, matrix rows are observations while columns are
> > the variables.
> >
> > Any idea why the "transposed" convention was selected in np.cov?
> > (I'm raising this question for informational purposes only... ;-) )
> >
> > I also compared with Octave to see how it works:
> > -- Function File: cov (X, Y)
> > Compute covariance.
> >
> > If each row of X and Y is an observation and each column is a
> > variable, the (I, J)-th entry of `cov (X, Y)' is the covariance
> > between the I-th variable in X and the J-th variable in Y. If
> > called with one argument, compute `cov (X, X)'.
> >
> >
> > (http://www.gnu.org/software/octave/doc/interpreter/Correlation-and-Regression-Analysis.html)
> > I like the clear tone of this description. But strangely enough, this is
> > a bit different from MATLAB.
> >
> >
> >> If there is a new xcov, then I think there should also be a xcorrcoef.
> >> This case needs a different implementation than corrcoef, since the
> >> xcov doesn't contain the variances and they need to be calculated
> >> separately.
> > Adding xcorrcoef as well would make sense. Using np.var with the
> > `axis` and `ddof` arguments set to appropriate values should then
> > bring the variances needed for the normalization.
> >
> > In the end, if adding xcov is the path of least resistance, this may be
> > the way to go. What do people think?
> >
> > Pierre
> >
>
> My vote is for xcov() and xcorrcoef(). It won't break compatibility,
> and the name of the function makes it clear what it does. It would
> also make sense to add "seealso" references to each other in the
> docstrings. The documentation for xcov() should also make clear the
> differences between cov() and xcov() with examples, and show how to
> get equivalent results using just cov() for those with older versions
> of numpy.
>
> Ben Root
>
-1 because these are too close to cross-correlation as used by signal
processing.

The output is still a covariance so do we really need yet another set of
very similar functions to maintain?
Or can we get away with a new keyword?

If speed really matters to you guys then surely moving np.cov into C
would have more impact on 'saving the world' than this proposal. That
also ignores the algorithm used
(http://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Covariance).

Actually np.cov is also deficient in that it does not have a dtype
argument, so it is prone to numerical precision errors (especially
when computing the mean of the array). This should probably be a ticket...

Bruce