[Numpy-discussion] Correlation filter

josef.pktd@gmai... josef.pktd@gmai...
Fri Nov 20 15:27:47 CST 2009


On Fri, Nov 20, 2009 at 2:28 PM, Keith Goodman <kwgoodman@gmail.com> wrote:
> On Fri, Nov 20, 2009 at 11:17 AM,  <josef.pktd@gmail.com> wrote:
>> On Fri, Nov 20, 2009 at 1:51 PM, Keith Goodman <kwgoodman@gmail.com> wrote:
>>> def corr3(x, y):
>>>    x = x - x.mean()
>>>    x /= x.std()
>>>    nx = x.size
>>>    one = np.ones(nx)
>>>    xy = lfilter(x, 1, y)
>>>    sy = lfilter(one, 1, y)
>>>    sy2 = lfilter(one, 1, y*y)
>>>    d = xy / np.sqrt(nx * sy2 - sy * sy)
>>>    return d
>>
>> Is this correct? xy uses the original y and not a demeaned y.
>
> I wouldn't be surprised if I made mistakes. But I don't think I need
> to demean y. One way to write the numerator of the correlation
> coefficent is
>
> N*sum(xy) - sum(x)*sum(y)
>
> but sum(x) is zero, so it becomes
>
> N*sum(xy)
>
> So I think I only need to demean x. But I'd better test the code. Plus
> I have no idea how lfilter will handle my missing values (NaN).

I think nans fully propagate, trick might be to fill with zero and use
another convolution/correlation with the non-nan mask to get the
number of observations included (I think I saw it as a trick by Pierre
for autocorrelation with missing observations).

Here is a moving slope function which estimate the slope coefficient
for a linear trend if x is arange()
I compared the ranking with your moving correlation and it is quite different.


def moving_slope(x,y)
    '''estimate moving slope coefficient of regression of y on x
    filters along axis=1, returns valid observations
    Todo: axis and lag options
    '''
    xx = np.column_stack((np.ones(x.shape), x))
    pinvxx = np.linalg.pinv(xx)[1:,:]
    windsize = len(x)
    lead = windsize//2 - 1
    return signal.correlate(y, pinvxx, 'full'
)[:,windsize-lead:-(windsize+1*lead-2)]


Josef

> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>


More information about the NumPy-Discussion mailing list