[Numpy-discussion] Definition of correlation, correlate and so on ?

Alan G Isaac aisaac at american.edu
Tue Dec 12 08:40:02 CST 2006

> On 12/12/06, David Cournapeau <david at ar.media.kyoto-u.ac.jp> wrote: 
>> I am polishing some code to compute autocorrelation using 
>> fft, and when testing the code against numpy.correlate, 
>> I realised that I am not sure about the definition... 
>> There are various functions related to correlation as far 
>> as numpy/scipy is concerned: 
>>     numpy.correlate 
>>     numpy.corrcoef 
>>     scipy.signal.correlate 
>> For me, the correlation between two sequences X and Y at lag t is 
>> sum(X[i] * Y*[i+t]), where Y* is the complex conjugate of Y. 
>> numpy.correlate does not use the conjugate, nor does scipy.signal.correlate, 
>> well, and I don't understand numpy.corrcoef. I've never seen complex 
>> correlation used without the conjugate, so I was curious why this 
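
To make the quoted definition concrete, here is a minimal sketch of the sum at a single lag, with and without the conjugate. The function name xcorr_at_lag and its conjugate flag are hypothetical, chosen only for this illustration; this is not how numpy.correlate or scipy.signal.correlate is implemented.

```python
import numpy as np

def xcorr_at_lag(x, y, lag, conjugate=True):
    """Return sum_i x[i] * conj(y[i + lag]) over the indices where both exist.

    Hypothetical helper for illustration; `conjugate=False` drops the
    conjugate, giving the plain sum of products instead.
    """
    x = np.asarray(x)
    y = np.asarray(y)
    # Keep only indices i with 0 <= i < len(x) and 0 <= i + lag < len(y).
    start = max(0, -lag)
    stop = min(len(x), len(y) - lag)
    if stop <= start:
        return 0  # the two sequences do not overlap at this lag
    ys = y[start + lag:stop + lag]
    if conjugate:
        ys = np.conj(ys)
    return np.sum(x[start:stop] * ys)

x = np.array([1j, 2])
with_conj = xcorr_at_lag(x, x, 0)                      # sum of |x[i]|**2
without_conj = xcorr_at_lag(x, x, 0, conjugate=False)  # sum of x[i]**2
```

With the conjugate, the lag-0 autocorrelation of a complex sequence is its energy, which is real and non-negative; without it, the terms can cancel, which is why the two definitions disagree for complex input.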

On Tue, 12 Dec 2006, Charles R Harris apparently wrote: 
> Neither have I, it is one of those oddities that may have 
> been inherited from Numeric. I wouldn't mind seeing it 
> changed but it is probably a bit late for that. 

I hope that "too late" is not a determining argument!

I hope the discussion will address the following questions:
- was there a justification for the extant behavior? If so, 
  what was it, and does it still seem valid?
- is the current definition reasonable; does it match 
  definitions in use in at least some domain?
- if not, is this behavior so unexpected as to be considered 
  a bug?
- are many existing applications depending on it?

The worst case is that it is a bug, but many existing users depend on
the current behavior. I am not taking a position, but that seems to be
the current view on this list.
I hope that *if* that is the assessment, then a transition 
path will be plotted.  For example, a keyword could be 
added, with a proper default, and a warning emitted when it 
is not set.
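
A rough sketch of such a transition path, under stated assumptions: the wrapper name correlate_transitional and the keyword conjugate are hypothetical and not part of numpy. The wrapper is built on numpy.convolve so that it does not depend on what numpy.correlate itself does with complex input.

```python
import warnings
import numpy as np

def correlate_transitional(a, v, mode="valid", conjugate=None):
    """Cross-correlate two 1-D sequences with an explicit conjugate switch.

    Hypothetical transition wrapper: when `conjugate` is left unset, it
    warns and falls back to the legacy (non-conjugating) behavior.
    """
    a = np.asarray(a)
    v = np.asarray(v)
    if conjugate is None:
        warnings.warn(
            "conjugate was not set; using the legacy behavior "
            "(conjugate=False). Pass conjugate explicitly to silence this.",
            FutureWarning,
            stacklevel=2,
        )
        conjugate = False
    kernel = np.conj(v) if conjugate else v
    # Correlation is convolution with the second sequence reversed.
    return np.convolve(a, kernel[::-1], mode=mode)

a = np.array([1j, 2])
v = np.array([1j, 1])
explicit_new = correlate_transitional(a, v, conjugate=True)
explicit_old = correlate_transitional(a, v, conjugate=False)
```

Existing callers keep getting the old results, but are told about the pending change; once the warning period is over, the default could be switched without touching callers who already pass the keyword.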

Alan Isaac
