[Numpy-discussion] Question on lstsq and correlation coeff

Anthony Kong Anthony.Kong@macquarie....
Wed Feb 25 17:21:48 CST 2009

Hi, all,
This is probably a newbie question.
I am trying to use scipy/numpy in a financial context. I want to compute
the correlation coefficient of two series (returns vs. index returns). I tried
two approaches. First:
from scipy.linalg import lstsq
coeffs,a,b,c = lstsq(matrix, returns) # matrix contains index returns
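Then I tried:
import numpy as np
cov = np.cov(idx1, returns)  # 2x2 covariance matrix (ddof=1 by default)
print(cov.tolist())
stddev_x = np.std(idx1, ddof=1)
stddev_y = np.std(returns, ddof=1)
# the off-diagonal entry cov[0, 1] is the covariance of idx1 and returns
print("cor = %s" % (cov[0, 1] / (stddev_x * stddev_y)))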

The two results differ.
As you can see from the numpy example, I am trying to find the correlation
coefficient for a sample (ddof=1).
So, my question is: is the discrepancy caused by the fact that I am
using lstsq() on a 'sample population' (i.e. I am not regressing
a full return series)? Is it correct to use lstsq() this way?
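
For reference, a minimal sketch of the comparison on synthetic data. The arrays are made up for illustration and are not the actual return series. It also assumes the design matrix holds the index returns plus a column of ones for an intercept, and it uses np.linalg.lstsq in place of scipy.linalg.lstsq so that only NumPy is needed. Under those assumptions the lstsq coefficient is a regression slope, which equals the correlation coefficient scaled by std(returns) / std(idx1), so the two numbers generally differ.

```python
import numpy as np

# Synthetic data, for illustration only (not real returns).
rng = np.random.default_rng(0)
idx1 = rng.normal(0.0, 0.01, size=250)                     # hypothetical index returns
returns = 1.5 * idx1 + rng.normal(0.0, 0.005, size=250)    # hypothetical asset returns

# Assumed design matrix: index returns plus an intercept column.
matrix = np.column_stack([idx1, np.ones_like(idx1)])
coeffs, residuals, rank, sv = np.linalg.lstsq(matrix, returns, rcond=None)
slope = coeffs[0]

# Sample correlation coefficient from the covariance matrix (ddof=1).
cov = np.cov(idx1, returns)
stddev_x = np.std(idx1, ddof=1)
stddev_y = np.std(returns, ddof=1)
cor = cov[0, 1] / (stddev_x * stddev_y)

# The regression slope is the correlation rescaled by the ratio of std devs.
slope_from_cor = cor * stddev_y / stddev_x

print("slope           = %s" % slope)
print("cor             = %s" % cor)
print("cor * s_y / s_x = %s" % slope_from_cor)
```

The last two printed values show the relationship: the slope matches cor * s_y / s_x, while cor itself stays within [-1, 1]. np.corrcoef(idx1, returns)[0, 1] gives the same correlation directly.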
Cheers, Anthony

