[Numpy-discussion] Correlation filter
josef.pktd@gmai...
josef.pktd@gmai...
Fri Nov 20 15:27:47 CST 2009
On Fri, Nov 20, 2009 at 2:28 PM, Keith Goodman <kwgoodman@gmail.com> wrote:
> On Fri, Nov 20, 2009 at 11:17 AM, <josef.pktd@gmail.com> wrote:
>> On Fri, Nov 20, 2009 at 1:51 PM, Keith Goodman <kwgoodman@gmail.com> wrote:
>>> def corr3(x, y):
>>> x = x - x.mean()
>>> x /= x.std()
>>> nx = x.size
>>> one = np.ones(nx)
>>> xy = lfilter(x, 1, y)
>>> sy = lfilter(one, 1, y)
>>> sy2 = lfilter(one, 1, y*y)
>>> d = xy / np.sqrt(nx * sy2 - sy * sy)
>>> return d
>>
>> Is this correct? xy uses the original y and not a demeaned y.
>
> I wouldn't be surprised if I made mistakes. But I don't think I need
> to demean y. One way to write the numerator of the correlation
> coefficent is
>
> N*sum(xy) - sum(x)*sum(y)
>
> but sum(x) is zero, so it becomes
>
> N*sum(xy)
>
> So I think I only need to demean x. But I'd better test the code. Plus
> I have no idea how lfilter will handle my missing values (NaN).
I think nans fully propagate, trick might be to fill with zero and use
another convolution/correlation with the non-nan mask to get the
number of observations included (I think I saw it as a trick by Pierre
for autocorrelation with missing observations).
Here is a moving slope function which estimate the slope coefficient
for a linear trend if x is arange()
I compared the ranking with your moving correlation and it is quite different.
def moving_slope(x,y)
'''estimate moving slope coefficient of regression of y on x
filters along axis=1, returns valid observations
Todo: axis and lag options
'''
xx = np.column_stack((np.ones(x.shape), x))
pinvxx = np.linalg.pinv(xx)[1:,:]
windsize = len(x)
lead = windsize//2 - 1
return signal.correlate(y, pinvxx, 'full'
)[:,windsize-lead:-(windsize+1*lead-2)]
Josef
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
More information about the NumPy-Discussion
mailing list