Cross-Correlation VS Convolution
Introduction
Some friends of mine with a background in electrical engineering have told me that computer science people stole the idea of convolution from electrical engineering, applied it to deep learning, and claimed they invented it. They further told me that computer science people do not use the idea correctly: the convolution used in deep learning is not the original convolution used in electrical engineering. In fact, it is cross-correlation rather than convolution.
I have no idea whether computer science people stole the convolution idea from electrical engineering or not. But in my opinion, cross-correlation and convolution are mathematically equivalent in a neural network. In this blog post, I would like to go over the definitions and some of the properties of cross-correlation and convolution, and discuss their applications in deep learning mathematically.
Cross-Correlation
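For continuous complex-valued functions $f$ and $g$, the cross-correlation is defined as

$$(f \star g)(\tau) \triangleq \int_{-\infty}^{\infty} \overline{f(t)} \, g(t + \tau) \, dt$$

Similarly, for discrete sequences, the cross-correlation is defined as

$$(f \star g)[n] \triangleq \sum_{m=-\infty}^{\infty} \overline{f[m]} \, g[m + n]$$

Here we denote the cross-correlation operation using $\star$, and $\overline{f}$ is the complex conjugate of $f$.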
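To make the discrete definition concrete, here is a minimal sketch that evaluates it directly for finite sequences. It assumes the common convention $(f \star g)[n] = \sum_m \overline{f[m]} \, g[m+n]$ with both sequences treated as zero outside their support. The function name `cross_correlate_full` and the sample values are made up for illustration. NumPy's `np.correlate` is used only as a cross-check.

```python
import numpy as np


def cross_correlate_full(f, g):
    """Directly evaluate sum_m conj(f[m]) * g[m + n] for every lag n.

    f and g are finite 1D sequences, treated as zero outside their support.
    The output covers lags n = -(len(f) - 1), ..., len(g) - 1.
    """
    f = np.asarray(f, dtype=complex)
    g = np.asarray(g, dtype=complex)
    lags = range(-(len(f) - 1), len(g))
    out = np.zeros(len(f) + len(g) - 1, dtype=complex)
    for k, n in enumerate(lags):
        for m in range(len(f)):
            if 0 <= m + n < len(g):
                out[k] += np.conj(f[m]) * g[m + n]
    return out


f = np.array([1 + 2j, 3 - 1j, 0.5j])
g = np.array([2.0, -1j, 4 + 1j, 1.0])
result = cross_correlate_full(f, g)

# np.correlate(a, v) computes sum_n a[n + k] * conj(v[n]), so the sequence
# that gets conjugated (f here) goes in the second argument.
reference = np.correlate(g, f, mode="full")
print(np.allclose(result, reference))
```

Note the argument order in the cross-check: under this convention, the conjugated sequence $f$ is the second argument of `np.correlate`.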
Convolution
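For continuous complex-valued functions $f$ and $g$, the convolution is defined as

$$(f * g)(\tau) \triangleq \int_{-\infty}^{\infty} f(t) \, g(\tau - t) \, dt$$

Similarly, for discrete sequences, the convolution is defined as

$$(f * g)[n] \triangleq \sum_{m=-\infty}^{\infty} f[m] \, g[n - m]$$

Here we denote the convolution operation using $*$.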
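The discrete convolution can be sketched in the same direct way, assuming the convention $(f * g)[n] = \sum_m f[m] \, g[n-m]$ and finite sequences that are zero outside their support. The function name `convolve_full` and the sample values are hypothetical. `np.convolve` serves as a cross-check.

```python
import numpy as np


def convolve_full(f, g):
    """Directly evaluate sum_m f[m] * g[n - m] for n = 0, ..., len(f) + len(g) - 2."""
    f = np.asarray(f, dtype=complex)
    g = np.asarray(g, dtype=complex)
    out = np.zeros(len(f) + len(g) - 1, dtype=complex)
    for n in range(len(out)):
        for m in range(len(f)):
            if 0 <= n - m < len(g):
                out[n] += f[m] * g[n - m]
    return out


f = np.array([1.0, 2.0, 3.0])
g = np.array([0.0, 1.0, 0.5])
result = convolve_full(f, g)

# Cross-check against NumPy, and check that swapping the arguments
# does not change the result (convolution is commutative).
print(np.allclose(result, np.convolve(f, g, mode="full")))
print(np.allclose(result, convolve_full(g, f)))
```

Unlike cross-correlation, no conjugate is involved, and the second sequence is traversed in reverse ($g[n-m]$ instead of $g[m+n]$).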
Cross-Correlation VS Convolution
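Cross-correlation and convolution can be converted to each other. Concretely,

$$f(t) \star g(t) = \overline{f(-t)} * g(t)$$

$$f[n] \star g[n] = \overline{f[-n]} * g[n]$$

Let’s show a quick proof for the continuous functions as an example.
Proof

$$
\begin{aligned}
\left( \overline{f(-t)} * g(t) \right)(\tau)
&= \int_{-\infty}^{\infty} \overline{f(-t)} \, g(\tau - t) \, dt \\
&= \int_{-\infty}^{\infty} \overline{f(s)} \, g(s + \tau) \, ds \qquad (s = -t) \\
&= \left( f(t) \star g(t) \right)(\tau)
\end{aligned}
$$

This concludes the proof.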
We can also verify the equivalence numerically using SciPy for the 1D and 2D cases.
from typing import Tuple, Union
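A minimal sketch of such a check follows, assuming NumPy and SciPy are installed. It is not the original listing. The random test data, the variable names, and the choice of `mode="full"` are assumptions for illustration. The identity being checked is $f \star g = \overline{f(-t)} * g$, where "flip" means reversing every axis.

```python
import numpy as np
from scipy import signal

rng = np.random.default_rng(0)

# 1D, complex-valued: f plays the role of the filter, g of the signal.
f1 = rng.normal(size=3) + 1j * rng.normal(size=3)
g1 = rng.normal(size=8) + 1j * rng.normal(size=8)
# Cross-correlation of f with g. SciPy conjugates its second argument,
# so f goes second.
corr_1d = signal.correlate(g1, f1, mode="full")
# Flip and conjugate f, then convolve with g.
conv_1d = signal.convolve(np.conj(f1[::-1]), g1, mode="full")

# 2D, real-valued: the conjugate is a no-op, so only the flip remains.
f2 = rng.normal(size=(3, 3))
g2 = rng.normal(size=(6, 7))
corr_2d = signal.correlate2d(g2, f2, mode="full")
# np.flip without an axis argument reverses both axes.
conv_2d = signal.convolve2d(g2, np.flip(f2), mode="full")

print(np.allclose(corr_1d, conv_1d))
print(np.allclose(corr_2d, conv_2d))
```

Both checks use the full output so that no boundary cropping can hide a misalignment between the two operations.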
Convolution in Deep Learning
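From the definitions, we can see that the convolutions in deep learning, where the input $x$ takes the role of $g$ and the filter $w$ takes the role of $f$, are actually cross-correlations.
Then why is calling cross-correlation "convolution" in deep learning still somewhat valid? This is because the filter $w$ is learned rather than specified by hand, and for real-valued filters the conversion above reduces to a flip: $w[n] \star x[n] = w[-n] * x[n]$.
We could apply true convolutions $w' * x$ in the neural network instead of the cross-correlations $w \star x$.
So, after the neural networks are trained using the same training configurations, the trained filters for the convolution and the cross-correlation are just the flip of each other, i.e.,

$$w'[n] = w[-n]$$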
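The flip relation between the trained filters can be illustrated with a toy experiment. This is a sketch under stated assumptions, not a real training setup: a single 1D real-valued filter, a squared-error loss, plain gradient descent, and initial filters that are mirror images of each other. The names `w_corr`, `w_conv`, and `w_true`, and all hyperparameters, are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=16)                    # input signal
w_true = np.array([0.5, -1.0, 2.0])        # hypothetical target filter
y = np.correlate(x, w_true, mode="valid")  # regression targets

w_corr = rng.normal(size=3)   # filter of the cross-correlation layer
w_conv = w_corr[::-1].copy()  # mirrored initialization (assumption)
lr = 0.01

for _ in range(2000):
    # Cross-correlation layer: y_hat[n] = sum_m w[m] * x[m + n].
    r_corr = np.correlate(x, w_corr, mode="valid") - y
    # Gradient of 0.5 * sum(r**2) w.r.t. w[m] is sum_n r[n] * x[m + n].
    w_corr -= lr * np.correlate(x, r_corr, mode="valid")

    # Convolution layer: y_hat[n] = sum_m w'[m] * x[n + M - 1 - m].
    r_conv = np.convolve(x, w_conv, mode="valid") - y
    # Same gradient expression, but indexed in reverse order.
    w_conv -= lr * np.correlate(x, r_conv, mode="valid")[::-1]

print(np.allclose(w_conv, w_corr[::-1]))
```

At every step the two layers produce the same outputs and the two gradients are flips of each other, so the filters stay mirrored throughout training.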
In practice, implementations use cross-correlation instead of convolution, because the mathematical expression for cross-correlation looks more intuitive than that for convolution.
References
Cross-Correlation VS Convolution
https://leimao.github.io/blog/Cross-Correlation-VS-Convolution/