Talk:Cross-correlation

Building up text to add once verified:

Given a reference signal and an input signal,
sref = 01011010010110000010111101111001001011010010111000100101101111
sinp = 01111011100100111000000111110011110010011100100111100001001110
the cross-correlation of the reference signal with the input signal reaches its maximum of 0.61 when the input signal is rotated to the left by 5 places (\Delta t = -5).
Image:cross-correlation.gif
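
For reference, a minimal Python sketch of this computation; that the rotation is circular and that the 0.61 figure is a Pearson correlation coefficient are assumptions on my part:

# Circular cross-correlation of the two binary sequences above, scored as a
# Pearson correlation coefficient at every left rotation of the input.
import numpy as np

sref = "01011010010110000010111101111001001011010010111000100101101111"
sinp = "01111011100100111000000111110011110010011100100111100001001110"

x = np.array([int(c) for c in sref], dtype=float)
y = np.array([int(c) for c in sinp], dtype=float)

best_shift, best_r = 0, -np.inf
for shift in range(len(y)):
    y_rot = np.roll(y, -shift)           # rotate the input left by `shift` places
    r = np.corrcoef(x, y_rot)[0, 1]      # correlation coefficient at this lag
    if r > best_r:
        best_shift, best_r = shift, r

print(f"max correlation {best_r:.2f} at left rotation {best_shift}")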

Waveguy 06:10, 28 Jan 2005 (UTC)


Things to cover:

  • Different variations, like the above binary signals, "regular old" digital signals like PCM audio, 2D cross-correlation of images, etc.
  • Circular cross-correlation
  • Faster calculation with the use of FFTs (see the sketch after this list)
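
On the FFT point, a minimal sketch of the standard trick (cross-correlation via the convolution theorem); the circular variant is shown, and NumPy is assumed:

# Circular cross-correlation via the FFT:
#   corr[m] = sum_n conj(x[n]) * y[n+m]  =  IFFT( conj(FFT(x)) * FFT(y) )[m]
import numpy as np

def circular_xcorr(x, y):
    x = np.asarray(x, dtype=complex)
    y = np.asarray(y, dtype=complex)
    return np.fft.ifft(np.conj(np.fft.fft(x)) * np.fft.fft(y))

# A pulse correlated against a rotated copy of itself peaks at the rotation:
x = np.array([0, 1, 1, 0, 0, 0, 0, 0])
y = np.roll(x, 3)
print(np.argmax(circular_xcorr(x, y).real))  # prints 3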

- Omegatron 04:41, Mar 20, 2005 (UTC)

Move to Cross covariance

Please look at the cross covariance article.

I moved the original definition (without dividing by the sigmas) to the cross covariance page. I know there is a lot of disagreement on the difference between covariance and correlation, or whether there is a difference at all, but it seems to be the consensus of the relevant pages that correlation does involve dividing by the sigmas, while covariance does not. See Covariance matrix for a short discussion. So, since the newly added material did not divide by the sigmas, I reverted back to the original. Here is a table I have been using for the relevant pages.

NO SIGMAS                           WITH SIGMAS
Covariance                          Correlation
Cross covariance                    Cross correlation (see ext)
Autocovariance                      Autocorrelation
Covariance matrix                   Correlation matrix
Estimation of covariance matrices

PAR 02:35, 10 July 2005 (UTC)

Discrete-Time Signal Processing by Oppenheim, Schafer, and Buck, which is the definitive textbook for DSP, defines the cross-correlation of two signals without dividing by any sigma. Numerical Recipes in C by Press et al. also defines it without dividing by sigma.
What you are calling the "cross correlation", dividing by sigma, is called the "linear-correlation coefficient" in the statistics text I happen to have on my shelf (Data Reduction and Error Analysis for the Physical Sciences by Bevington and Robinson.)
Perhaps there is a difference in usage between the statistics and signal-processing/engineering communities. Even if so, it is not Wikipedia's place to anoint one usage as the "right" one.
—Steven G. Johnson 19:13, July 10, 2005 (UTC)

I'm not anointing here, I'm just trying to clarify things. Looking at the table above, cross-correlation was the only article in conflict with every other article in the table as far as the sigmas were concerned, so I changed it. If you have a better idea, let's do it. PAR 01:52, 11 July 2005 (UTC)

Please realize that the comments here apply to every other correlation article in the table. I think that the articles should list both the forms with and without sigma, probably merging the covariance/correlation articles to avoid duplication, explain the context for the different usages in signal processing and statistics, and explain the impact of the sigma. As it stands, Wikipedia is anointing one particular usage as the correct one, which is wrong. —Steven G. Johnson 15:55, July 11, 2005 (UTC)
It seems that the definition is ambiguous: either we need to find the dominant definition and go with that, or we have to present both. Cburnett 19:23, July 11, 2005 (UTC)

After checking 7 different statistics books, the following is unanimous:

  • The covariance of two different random variates X and Y is
Cov(X, Y) = E[(X - E[X]) (Y - E[Y])], where E[X] is the expectation value of X.
  • The (linear) correlation coefficient is
R(X, Y) = Cov(X, Y) / (S(X) S(Y)), where S(X) is the standard deviation of X.
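
A quick numerical check of the distinction (NumPy; the data here are arbitrary):

# Covariance vs. the (linear) correlation coefficient: the latter is just the
# former divided by the two standard deviations, which makes it scale-free.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=1000)
y = 2.0 * x + rng.normal(size=1000)

cov = np.mean((x - x.mean()) * (y - y.mean()))  # Cov(X, Y)
r = cov / (x.std() * y.std())                   # R(X, Y)

print(cov, r)                   # cov changes if x or y is rescaled; r does not
print(np.corrcoef(x, y)[0, 1])  # agrees with r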

Oppenheim et al. is the only one to define cross-covariance and cross-correlation, and they do it in a very consistent way:

cross-correlation:  \phi_{xy}[m] = E[x_n \, y_{n+m}]
cross-covariance:   \gamma_{xy}[m] = E[(x_n - E[x]) (y_{n+m} - E[y])]
autocorrelation:    \phi_{xx}[m] = E[x_n \, x_{n+m}]
autocovariance:     \gamma_{xx}[m] = E[(x_n - E[x]) (x_{n+m} - E[x])]
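
To make the distinction concrete, a sketch of sample estimates of these four quantities (stationarity is assumed, the time average stands in for E[.], and the function names are mine):

# Sample estimates of Oppenheim's quantities at lag m >= 0.  Pass y = x to
# get the auto- versions; subtracting the means turns phi into gamma.
import numpy as np

def phi(x, y, m):
    """Cross-correlation phi_xy[m] = E[x_n y_{n+m}], estimated by averaging."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    n = np.arange(len(x) - m)
    return np.mean(x[n] * y[n + m])

def gamma(x, y, m):
    """Cross-covariance gamma_xy[m]: the same average with the means removed."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return phi(x - x.mean(), y - y.mean(), m)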

I think it's clear that my moving cross-correlation to cross-covariance was wrong. It's not the sigmas that distinguish correlation from covariance, it's the subtraction of the means; the division by the sigmas is a separate issue. I would like to alter all the articles to conform with Oppenheim's definitions, and to add the alternate definitions very clearly. There will be no conflict with the seven books I mentioned, but there will be a clear conflict with the autocorrelation article as it stands. I understand that we do not want to favor a particular set of definitions if there is any controversy, but the controversy seems not so bad, and we do want some clarity and predictability in these articles rather than conflicting or missing definitions. I will make these changes in a day or two unless there is an objection. PAR 00:18, 12 July 2005 (UTC)

I have also seen "correlation function" used in physical contexts for the subtracted-mean version. The difference is also blurred because in many important cases the mean is zero. I would prefer if auto-correlation, auto-covariance, cross-correlation, and cross-covariance were all defined on a single page (with appropriate redirects). Mathematically, they are so closely related that it hardly makes sense to me to have separate pages. (I'm not sure what to do about the dividing-by-sigma variant, since I'm not so familiar with that). —Steven G. Johnson 01:37, July 12, 2005 (UTC)

OK, how about this: a page entitled "covariance and correlation" or something similar. It explains that there are conflicting definitions. Then it adopts Oppenheim's definitions for the sake of clarity, not because they are best, but because we need to settle on some consistent way of speaking about things. Then it links to the various pages, each of which is rewritten consistently with these definitions, including the important alternate definitions. They are also rewritten as if they might be subsections of the main page. If, after this is all done, they look like they ought to go into the main article, we can do that. That way there's no big lump of work that needs to be done; it can all be done slowly, in pieces. PAR 02:07, 12 July 2005 (UTC)

Sounds good to me. —Steven G. Johnson 02:45, July 12, 2005 (UTC)

Ok - I put up a "starter page" at Covariance and correlation. PAR 04:06, 12 July 2005 (UTC)

By the way, I don't think there is anything wrong with an editorial policy that enforces consistent terminology. This is not at odds with NPOV: alternative definitions should be mentioned, but for the sake of coherence and consistency a common set of terms should be used. See for example groupoid vs. magma (algebra) for a precedent. --MarkSweep 16:58, 12 July 2005 (UTC)

Some examples

I need examples of mean, median, mode, variability, range, variance, correlation, standard deviation, and skewness related to marketing.

f*?

The article mentions f* in many equations but doesn't define it. What's f*? —Ben FrantzDale 20:07, 19 December 2006 (UTC)

A superscript asterisk indicates the complex conjugate. --Abdull 21:06, 20 May 2007 (UTC)
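
Spelled out, for anyone else who wonders: if f(t) = a(t) + i\,b(t) with a and b real, then

f^*(t) = a(t) - i\,b(t),

so for a real signal f^* = f and the conjugation can be ignored.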

Zero-Lag?

Could someone please add information on the zero-lag value? We should also mention that cross-correlation is distributive over addition but, unlike convolution, neither commutative nor associative.
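
A quick numerical illustration of the non-commutativity (NumPy; the arrays are arbitrary):

# Swapping the inputs of a cross-correlation does not reproduce the result;
# for real signals it reverses it: (f corr g)[m] = (g corr f)[-m].
import numpy as np

f = np.array([1.0, 2.0, 3.0])
g = np.array([0.0, 1.0, 0.5])

fg = np.correlate(f, g, mode="full")
gf = np.correlate(g, f, mode="full")

print(fg)        # [0.5 2.  3.5 3.  0. ], not equal to gf
print(gf[::-1])  # same as fg: reversing the swapped result recovers it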

Cross-correlation and convolution

I don't see yet why f(t) \star g(t) = f^*(-t) * g(t). Let's simplify things by taking f(t) and g(t) to be real functions, so that f(t) \star g(t) = f(-t) * g(t).

As f(t) * g(t) = \int_{a}^{b} f(\tau)\, g(t - \tau)\, d\tau, what does f(-t) * g(t) look like expressed in integral form?

Besides, if convolution is commutative and cross-correlation is not commutative, why can one say f(t) \star g(t) = f^*(-t) * g(t) at all? --Abdull 21:23, 20 May 2007 (UTC)
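
A sketch of an answer for the real case, assuming the definition (f \star g)(t) = \int_{-\infty}^{\infty} f(\tau)\, g(\tau + t)\, d\tau:

f(-t) * g(t) = \int_{-\infty}^{\infty} f(-(t - \tau))\, g(\tau)\, d\tau = \int_{-\infty}^{\infty} f(\tau - t)\, g(\tau)\, d\tau = \int_{-\infty}^{\infty} f(u)\, g(u + t)\, du = (f \star g)(t),

using the substitution u = \tau - t. On the commutativity question: the identity applies the time reversal (and conjugation) specifically to f, so swapping f and g on the left also swaps which function gets reversed on the right; there is no conflict with the commutativity of convolution.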

Dangling citation

The in-text citation "(Campbell, Lo, and MacKinlay 1996)" isn't all that useful without the actual reference. Could whoever put that in please add the full citation in a "references" section?

Appropriate integration limits

The article currently does not describe the appropriate integration limits. Can someone who knows the answer please add them? Ngchen 14:58, 7 October 2007 (UTC)
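
For what it's worth, in the usual continuous-time definition the integral runs over the whole real line,

(f \star g)(\tau) = \int_{-\infty}^{\infty} f^*(t)\, g(t + \tau)\, dt,

with the limits shrinking to the region where the shifted signals overlap for finite-length signals, or to a single period for periodic signals (the circular cross-correlation).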

Possible split

I think that the usage and terminology in "signal processing" and in "statistics" are so different that a split into articles specific to each is required. Melcombe (talk) 11:47, 26 March 2008 (UTC)