[Numpy-discussion] Question on lstsq and correlation coeff

Anthony Kong Anthony.Kong@macquarie....
Wed Feb 25 18:31:37 CST 2009


Hi Josef,

Thanks very much for the quick and helpful response.

Could you also comment on the use of lstsq()? Why does it lead to an inconsistent result?

Cheers, Anthony

-----Original Message-----
From: numpy-discussion-bounces@scipy.org [mailto:numpy-discussion-bounces@scipy.org] On Behalf Of josef.pktd@gmail.com
Sent: Thursday, 26 February 2009 11:09 AM
To: Discussion of Numerical Python
Subject: Re: [Numpy-discussion] Question on lstsq and correlation coeff

On Wed, Feb 25, 2009 at 6:21 PM, Anthony Kong <Anthony.Kong@macquarie.com> wrote:
> Hi, all,
>
> It is probably a newbie question.
>
> I am trying to use scipy/numpy in a financial context. I want to compute
> the correlation coefficient of two series (returns vs. index returns). I
> tried two approaches.
>
> Firstly,
>
> from scipy.linalg import lstsq
> coeffs,a,b,c = lstsq(matrix, returns) # matrix contains index returns
>
> then I tried,
>
> import numpy as np
> cov = np.cov(idx1, returns)
> print cov.tolist()
> stddev_x = np.std(returns, ddof=1)
> stddev_y = np.std(idx1, ddof=1)
> print "cor = %s" % (cov.tolist()[:-1] /(stddev_x * stddev_y))
>
> They differ from each other.
>
> As you can see from the numpy example, I am trying to find the
> correlation coefficient for a sample (ddof=1).
>
> So, my question is: is the discrepancy caused by the fact that I am
> trying to use lstsq() on a 'sample population' (i.e. I am not
> regressing a full return series)? Is it correct to use lstsq() this way?
>

The most direct way is to calculate the correlation matrix and use index [0,1] to get the coefficient:

numpy.corrcoef(x, y=None, rowvar=1, bias=0)
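A minimal sketch of that call, assuming two equal-length 1-D arrays. The data here is randomly generated purely for illustration; the names idx1 and returns are borrowed from the question, and the sizes and scales are made up:

```python
import numpy as np

# Hypothetical sample data standing in for index returns and asset returns.
rng = np.random.default_rng(0)
idx1 = rng.normal(0.0, 0.01, size=250)
returns = 0.8 * idx1 + rng.normal(0.0, 0.005, size=250)

# corrcoef returns a symmetric 2x2 matrix with ones on the diagonal.
corr_matrix = np.corrcoef(idx1, returns)

# The off-diagonal entry [0, 1] is the correlation between the two series.
corr = corr_matrix[0, 1]
print("cor = %s" % corr)
```

The normalization cancels in the correlation coefficient, so no ddof choice is needed here.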

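Also check that np.cov and np.std use the same normalization: np.cov divides by N-1 by default (bias=0), whereas np.std divides by N unless you pass ddof=1, as you did. Note too that the covariance of the two series is the off-diagonal element cov[0,1], not a slice of rows.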
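As a rough illustration of the manual route and of how it relates to a least-squares fit, here is a sketch under some assumptions: the data is synthetic, ddof=1 is passed explicitly everywhere so the normalization is unambiguous, the design matrix includes an intercept column (the original post does not say how its matrix was built), and np.linalg.lstsq is used in place of scipy.linalg.lstsq to keep the example numpy-only:

```python
import numpy as np

# Hypothetical sample data standing in for index returns and asset returns.
rng = np.random.default_rng(1)
idx1 = rng.normal(0.0, 0.01, size=250)
returns = 0.8 * idx1 + rng.normal(0.0, 0.005, size=250)

# Use the same normalization (N - 1) for the covariance and both std devs,
# and take the off-diagonal element of the 2x2 covariance matrix.
cov_xy = np.cov(idx1, returns, ddof=1)[0, 1]
std_x = np.std(idx1, ddof=1)
std_y = np.std(returns, ddof=1)
manual_corr = cov_xy / (std_x * std_y)

# Least-squares fit: returns ~ slope * idx1 + intercept.
design = np.column_stack([idx1, np.ones_like(idx1)])
(slope, intercept), *_ = np.linalg.lstsq(design, returns, rcond=None)

# The fitted slope is cov(x, y) / var(x), not the correlation itself;
# rescaling by std_x / std_y recovers the correlation coefficient.
corr_from_slope = slope * std_x / std_y

print("manual corr     = %s" % manual_corr)
print("lstsq slope     = %s" % slope)
print("corr from slope = %s" % corr_from_slope)
```

So a coefficient from lstsq() would only be expected to equal the correlation when both series have the same standard deviation, and a fit without an intercept column would differ again.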

Josef
