# [Numpy-discussion] Cross-covariance function

Pierre Haessig pierre.haessig@crans....
Wed Feb 1 10:47:42 CST 2012

```
Hi Bruce,
Sorry for the delayed reply.

On 27/01/2012 at 17:28, Bruce Southey wrote:
> The output is still a covariance so do we really need yet another set
> of very similar functions to maintain?
> Or can we get away with a new keyword?
>
The idea of an additional keyword seems appealing.
Just to make sure I understood it well, you would be proposing a new
signature like:
def cov(..., get_full_cov_matrix=True)

and when `get_full_cov_matrix` is set to False, only the cross-covariance
block would be returned. Am I right?
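As a rough, untested sketch of what I mean (the keyword name and the
block slicing are just for illustration, not an actual numpy API):

import numpy as np

def cov(x, y, get_full_cov_matrix=True):
    # rows = variables, columns = observations, as in np.cov
    x = np.atleast_2d(x)
    y = np.atleast_2d(y)
    C = np.cov(np.vstack([x, y]))
    if get_full_cov_matrix:
        return C          # full (p+q, p+q) covariance matrix
    p = x.shape[0]
    return C[:p, p:]      # only the cross-covariance block cov(x_i, y_j)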
> If speed really matters to you guys then surely moving np.cov into C
> would have more impact on 'saving the world' than this proposal. That
> also ignores algorithm used (
> http://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Covariance).
>
>
Regarding the nomenclature of that Wikipedia page, I would say that
numpy.cov is based on a vectorized "two-pass algorithm": it computes the
means first and then subtracts them before computing the matrix product.
Would you do it differently?
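In code, the two-pass computation I have in mind looks roughly like this
(a sketch of the idea, not the actual numpy source):

import numpy as np

def two_pass_cov(m):
    X = np.atleast_2d(m).astype(float)
    # pass 1: compute and subtract the per-variable means
    X = X - X.mean(axis=1, keepdims=True)
    # pass 2: the matrix product, with the unbiased N-1 normalization
    fact = X.shape[1] - 1
    return np.dot(X, X.conj().T) / fact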

> Actually np.cov also is deficient in that it does not have the dtype
> argument so it is prone to numerical precision errors (especially
> getting the mean of the array). Probably should be a ticket...
I'm not a specialist in numerical precision, but I was very impressed
by the example raised on Jan 24th by Michael Aye, which was one of the
first "real life" examples I've seen.

Looking at the cov algorithm, I first see a possibility to propagate an
optional dtype argument down to the mean computation. However, I'm
unsure what to do afterwards for the matrix product, since
"dot(X.T, X.conj()) / fact" is also a sort of mean computation and can
therefore also be affected by numerical precision issues. What would
you suggest?
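Concretely, the dtype propagation I have in mind would be something like
this (sketch only; such a keyword does not exist in np.cov today):

import numpy as np

def cov_with_dtype(m, dtype=np.float64):
    X = np.atleast_2d(m)
    # accumulate the mean in the requested precision
    mean = X.mean(axis=1, dtype=dtype, keepdims=True)
    X = X.astype(dtype) - mean
    fact = X.shape[1] - 1
    # the product then also runs in that precision, but it is still a
    # one-shot summation, so some accumulation error remains possible
    return np.dot(X, X.conj().T) / fact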

(The only solution I see would be to use the running variance algorithm.
Since that code wouldn't be vectorized anymore, it would indeed benefit
from a move to C.)
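For reference, here is the running algorithm for a single covariance of
two variables (cf. the Wikipedia page above) -- this element-wise loop
is exactly the kind of code that is slow in pure Python and would need C:

def running_cov(xs, ys):
    # one-pass "online" covariance update
    mean_x = mean_y = c = 0.0
    n = 0
    for x, y in zip(xs, ys):
        n += 1
        dx = x - mean_x
        mean_x += dx / n
        mean_y += (y - mean_y) / n
        c += dx * (y - mean_y)  # note: uses the *updated* mean_y
    return c / (n - 1)          # unbiased sample covariance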

Best,
Pierre

```