[Numpy-discussion] Question on lstsq and correlation coeff

Anthony Kong Anthony.Kong@macquarie....
Wed Feb 25 17:21:48 CST 2009

Hi, all,
This is probably a newbie question.
I am trying to use scipy/numpy in a financial context. I want to compute
the correlation coefficient of two series (returns vs. index returns). I
tried two approaches. First:
from scipy.linalg import lstsq
# lstsq returns (solution, residues, rank, singular values)
coeffs, residues, rank, sv = lstsq(matrix, returns)  # matrix contains index returns
Then I tried:
import numpy as np
cov = np.cov(idx1, returns)  # 2x2 sample covariance matrix
print(cov.tolist())
stddev_x = np.std(idx1, ddof=1)
stddev_y = np.std(returns, ddof=1)
# cov[0, 1] is the covariance between idx1 and returns
print("cor = %s" % (cov[0, 1] / (stddev_x * stddev_y)))

The two results differ from each other.
As you can see from the numpy example, I am trying to find the correlation
coefficient for a sample (ddof=1).
So, my question is: is the discrepancy caused by the fact that I am
trying to use lstsq() on a 'sample population' (i.e. I am not regressing
a full return series)? Is it correct to use lstsq() this way?
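
To make the comparison concrete, here is a minimal sketch on made-up data
(the arrays below are synthetic, not my actual series, and the names
`design`, `std_idx` and `std_ret` are just for illustration). It assumes
the design matrix has a column of ones for an intercept, which may not
match how `matrix` was built above, and it uses numpy.linalg.lstsq rather
than scipy.linalg.lstsq so that it only needs numpy. As far as I
understand, the least-squares slope and the correlation coefficient are
different quantities, related by cor = slope * std(x) / std(y):

```python
import numpy as np

# Synthetic data for illustration only; not the series from this post.
rng = np.random.default_rng(0)
idx1 = rng.normal(0.0, 0.01, size=250)                  # hypothetical index returns
returns = 1.5 * idx1 + rng.normal(0.0, 0.02, size=250)  # hypothetical asset returns

# Approach 1: least-squares fit of returns = intercept + slope * idx1.
design = np.column_stack([np.ones_like(idx1), idx1])
coeffs, residues, rank, sv = np.linalg.lstsq(design, returns, rcond=None)
intercept, slope = coeffs

# Approach 2: sample correlation from covariance and standard deviations.
cov = np.cov(idx1, returns)  # normalises by N - 1 by default
std_idx = np.std(idx1, ddof=1)
std_ret = np.std(returns, ddof=1)
cor = cov[0, 1] / (std_idx * std_ret)

# Rescaling the regression slope gives the correlation coefficient.
cor_from_slope = slope * std_idx / std_ret

print("slope          = %s" % slope)
print("cor            = %s" % cor)
print("cor from slope = %s" % cor_from_slope)
print("np.corrcoef    = %s" % np.corrcoef(idx1, returns)[0, 1])
```

In this sketch the slope and the correlation are not expected to agree
unless the two series happen to have the same standard deviation, while
the rescaled slope should agree with both the covariance-based value and
np.corrcoef.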
Cheers, Anthony

