
Lei Mao



One of the key discoveries of quantum mechanics is that two physical observables cannot always both be measured with arbitrary precision. This is Heisenberg’s general uncertainty principle.

We might have learned a special case of Heisenberg’s general uncertainty principle in high school or college physics courses: the more precisely the position of a particle is determined, the less precisely its momentum can be predicted from initial conditions, and vice versa. In this case, the two physical observables are position and momentum.

Formally, Heisenberg’s general uncertainty principle states that the product of the variances of two arbitrary hermitian operators on a given state is always greater than or equal to one-fourth of the squared absolute value of the expected value of their commutator. In formulas:

\[\mathbb{V}_{\psi}(\Omega_1) \mathbb{V}_{\psi}(\Omega_2) \geq \frac{1}{4} | \langle [\Omega_1, \Omega_2] \rangle_{\psi} | ^2\]

where $\mathbb{V}_{\psi}(\Omega_1)$ and $\mathbb{V}_{\psi}(\Omega_2)$ are the variances of $\Omega_1$ and $\Omega_2$ on the state $\psi$, using the notation from my previous post, and $[\Omega_1, \Omega_2] = \Omega_1 \Omega_2 - \Omega_2 \Omega_1$ is called the commutator of $\Omega_1$ and $\Omega_2$.
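
As a quick sanity check against the familiar special case, here is a sketch that assumes the canonical commutation relation $[X, P] = i \hbar I$ for the position operator $X$ and the momentum operator $P$. This relation is a standard result that is not derived in this post, and $X$ and $P$ are symbols introduced only for this illustration. For a normalized state $\psi$, the right-hand side then no longer depends on the state:

\[\mathbb{V}_{\psi}(X) \mathbb{V}_{\psi}(P) \geq \frac{1}{4} | \langle i \hbar I \rangle_{\psi} | ^2 = \frac{\hbar^2}{4}\]

Taking square roots gives the textbook form: the product of the standard deviations of position and momentum is at least $\hbar / 2$. Note that $X$ and $P$ act on an infinite-dimensional space, so this is only a formal substitution into the inequality.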

In this blog post, I would like to show a mathematical proof of Heisenberg’s general uncertainty principle.


We will use the following four theorems and properties to prove Heisenberg’s general uncertainty principle.

Cauchy–Schwarz Inequality

The Cauchy–Schwarz inequality states that for all vectors $u$ and $v$ of an inner product space it is true that

\[\langle u, u \rangle \langle v, v \rangle \geq | \langle u, v \rangle | ^ 2\]

The proof could be found on Wikipedia.
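
The inequality is easy to check numerically. The following is a minimal sketch in plain Python, assuming the convention $\langle u, v \rangle = v^{\dagger} u$ used later in this post. The helper names `inner` and `random_vector` are made up for this illustration.

```python
import random


def inner(u, v):
    """Inner product <u, v> = v^dagger u (conjugate taken on the second argument)."""
    return sum(vi.conjugate() * ui for ui, vi in zip(u, v))


def random_vector(n, rng):
    """Random complex vector with real and imaginary parts drawn from [-1, 1]."""
    return [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]


rng = random.Random(0)  # fixed seed so the run is reproducible
u = random_vector(4, rng)
v = random_vector(4, rng)

lhs = (inner(u, u) * inner(v, v)).real  # <u, u><v, v> is real and non-negative
rhs = abs(inner(u, v)) ** 2             # |<u, v>|^2

print(lhs >= rhs)
```

Equality holds when one vector is a scalar multiple of the other, which is the case to keep in mind when asking whether the uncertainty bound can be saturated.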

Imaginary Part of Vector Inner Product

The imaginary part of the inner product of any two vectors $u$ and $v$, $\text{Im}\big(\langle u, v \rangle\big)$, can be computed as

\[\text{Im}\big( \langle u, v \rangle \big) = \frac{1}{2i} \big(\langle u, v \rangle - \langle v, u \rangle \big)\]

To see this, write $u = a_1 + b_1 i$ and $v = a_2 + b_2 i$, where $a_1$, $a_2$, $b_1$, and $b_2$ are real (column) vectors. Then

\[\begin{align} \langle u, v \rangle &= v^{\dagger} u \\ &= (a_2 + b_2 i)^{\dagger} (a_1 + b_1 i) \\ &= (a_2^{\top} - b_2^{\top} i) (a_1 + b_1 i) \\ &= (a_2^{\top} a_1 + b_2^{\top} b_1) + (a_2^{\top} b_1 - b_2^{\top} a_1) i \\ &= \text{Re}\big(\langle u, v \rangle\big) + \text{Im}\big(\langle u, v \rangle\big) i\\ \end{align}\]

\[\begin{align} \langle v, u \rangle &= u^{\dagger} v \\ &= (a_1 + b_1 i)^{\dagger} (a_2 + b_2 i) \\ &= (a_1^{\top} - b_1^{\top} i) (a_2 + b_2 i) \\ &= (a_1^{\top} a_2 + b_1^{\top} b_2) + (a_1^{\top} b_2 - b_1^{\top} a_2) i \\ &= (a_2^{\top} a_1 + b_2^{\top} b_1) - (a_2^{\top} b_1 - b_2^{\top} a_1) i \\ &= \text{Re}\big(\langle u, v \rangle\big) - \text{Im}\big(\langle u, v \rangle\big) i\\ \end{align}\]

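Subtracting the second equation from the first gives $\langle u, v \rangle - \langle v, u \rangle = 2i \, \text{Im}\big(\langle u, v \rangle\big)$, so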

\[\text{Im}\big( \langle u, v \rangle \big) = \frac{1}{2i} \big(\langle u, v \rangle - \langle v, u \rangle \big)\]

This concludes the proof.
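
A small worked example makes the identity concrete. This is a sketch in plain Python with two arbitrarily chosen vectors. The helper `inner` is a name introduced for this illustration and follows the convention $\langle u, v \rangle = v^{\dagger} u$.

```python
def inner(u, v):
    """Inner product <u, v> = v^dagger u (conjugate taken on the second argument)."""
    return sum(vi.conjugate() * ui for ui, vi in zip(u, v))


# Two arbitrary complex vectors, chosen only for illustration.
u = [1 + 2j, 3 - 1j]
v = [2 - 1j, 1j]

z = inner(u, v)
# Right-hand side of the identity: (<u, v> - <v, u>) / (2i)
identity = (inner(u, v) - inner(v, u)) / 2j

print(z)         # the inner product <u, v>
print(z.imag)    # its imaginary part, read off directly
print(identity)  # the same value, obtained from the identity
```

The identity works because swapping the arguments conjugates the inner product, so the real parts cancel in the difference.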

Triangle Inequality

For any complex number $z$, we have

\[| z | ^2 \geq | \text{Im}(z) | ^2\]

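This follows directly from $| z | ^2 = \text{Re}(z)^2 + \text{Im}(z)^2$. Geometrically, in the complex plane $| z |$ is the hypotenuse of a right triangle whose legs are $| \text{Re}(z) |$ and $| \text{Im}(z) |$, and the hypotenuse is never shorter than a leg. We will skip a more formal proof here.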

Hermitian Property

If $A$ is a hermitian $n$-by-$n$ matrix, then for all $u, v \in \mathbb{C}^n$, we have

\[\langle Au, v \rangle = \langle u, Av \rangle\]

Using the defining property of a hermitian matrix, $A^{\dagger} = A$, we have

\[\begin{align} \langle Au, v \rangle &= v^{\dagger} Au \\ &= v^{\dagger} A^{\dagger} u \\ &= {(Av)}^{\dagger} u \\ &= \langle u, Av \rangle \end{align}\]

This concludes the proof.
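
The property can also be checked on a concrete matrix. The sketch below uses plain Python and an arbitrarily chosen $2$-by-$2$ hermitian matrix `A`, together with a non-hermitian matrix `B` for contrast. The helpers `inner` and `matvec` are names introduced for this illustration.

```python
def inner(u, v):
    """Inner product <u, v> = v^dagger u (conjugate taken on the second argument)."""
    return sum(vi.conjugate() * ui for ui, vi in zip(u, v))


def matvec(A, u):
    """Matrix-vector product A u for a matrix stored as a list of rows."""
    return [sum(a * x for a, x in zip(row, u)) for row in A]


# A equals its own conjugate transpose, so it is hermitian.
A = [[2, 1 - 1j],
     [1 + 1j, -3]]
# B is not hermitian, so the property is not expected to hold for it.
B = [[0, 1],
     [0, 0]]

u = [1 + 2j, 3 - 1j]
v = [2 - 1j, 1j]

left = inner(matvec(A, u), v)    # <Au, v>
right = inner(u, matvec(A, v))   # <u, Av>
print(left, right)               # equal for the hermitian matrix A

left_b = inner(matvec(B, u), v)  # <Bu, v>
right_b = inner(u, matvec(B, v)) # <u, Bv>
print(left_b, right_b)           # differ for the non-hermitian matrix B
```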

Proof of Heisenberg’s General Uncertainty Principle

Because $\Omega$ is a hermitian operator, its expected value $\langle \Omega \rangle_{\psi}$ is real, so $\Delta_{\psi}(\Omega) = \Omega - \langle \Omega \rangle_{\psi} I$ is also a hermitian operator. Based on the definition of variance, and using the hermitian property, we further have

\[\begin{align} \mathbb{V}_{\psi}(\Omega) &= \langle (\Delta_{\psi}(\Omega)) (\Delta_{\psi}(\Omega)) \rangle_{\psi} \\ &= \langle (\Delta_{\psi}(\Omega)) (\Delta_{\psi}(\Omega))\psi, \psi \rangle \\ &= \langle (\Delta_{\psi}(\Omega))\psi, (\Delta_{\psi}(\Omega)) \psi \rangle \\ \end{align}\]

We apply the Cauchy–Schwarz inequality to the left-hand side of Heisenberg’s general uncertainty principle,

\[\begin{align} \mathbb{V}_{\psi}(\Omega_1) \mathbb{V}_{\psi}(\Omega_2) &= \langle (\Delta_{\psi}(\Omega_1))\psi, (\Delta_{\psi}(\Omega_1)) \psi \rangle \langle (\Delta_{\psi}(\Omega_2))\psi, (\Delta_{\psi}(\Omega_2)) \psi \rangle \\ &\geq | \langle (\Delta_{\psi}(\Omega_1))\psi, (\Delta_{\psi}(\Omega_2))\psi \rangle | ^ 2 \end{align}\]

We further apply the Triangle inequality,

\[\begin{align} \mathbb{V}_{\psi}(\Omega_1) \mathbb{V}_{\psi}(\Omega_2) &\geq | \langle (\Delta_{\psi}(\Omega_1))\psi, (\Delta_{\psi}(\Omega_2))\psi \rangle | ^ 2 \\ &\geq \big| \text{Im}\big( \langle (\Delta_{\psi}(\Omega_1))\psi, (\Delta_{\psi}(\Omega_2))\psi \rangle \big) \big| ^ 2 \end{align}\]

We use the imaginary part of vector inner product,

\[\begin{align} \mathbb{V}_{\psi}(\Omega_1) \mathbb{V}_{\psi}(\Omega_2) &\geq \big| \text{Im}\big( \langle (\Delta_{\psi}(\Omega_1))\psi, (\Delta_{\psi}(\Omega_2))\psi \rangle \big) \big| ^ 2 \\ &= \Big| \frac{1}{2i} \big(\langle (\Delta_{\psi}(\Omega_1))\psi, (\Delta_{\psi}(\Omega_2))\psi \rangle - \langle (\Delta_{\psi}(\Omega_2))\psi, (\Delta_{\psi}(\Omega_1))\psi \rangle \big) \Big| ^ 2 \\ \end{align}\]

We use the hermitian property again,

\[\begin{align} \mathbb{V}_{\psi}(\Omega_1) \mathbb{V}_{\psi}(\Omega_2) &\geq \Big| \frac{1}{2i} \big(\langle (\Delta_{\psi}(\Omega_1))\psi, (\Delta_{\psi}(\Omega_2))\psi \rangle - \langle (\Delta_{\psi}(\Omega_2))\psi, (\Delta_{\psi}(\Omega_1))\psi \rangle \big) \Big| ^ 2 \\ &= \Big| \frac{1}{2i} \big(\langle (\Delta_{\psi}(\Omega_2)) (\Delta_{\psi}(\Omega_1))\psi, \psi \rangle - \langle (\Delta_{\psi}(\Omega_1)) (\Delta_{\psi}(\Omega_2))\psi, \psi \rangle \big) \Big| ^ 2 \\ &= \Big| \frac{1}{2i} \big(\langle ( (\Delta_{\psi}(\Omega_2)) (\Delta_{\psi}(\Omega_1)) - (\Delta_{\psi}(\Omega_1)) (\Delta_{\psi}(\Omega_2)) ) \psi, \psi \rangle \big) \Big| ^ 2 \\ &= \Big| \frac{1}{2i} \big(\langle ( (\Omega_2 - \langle \Omega_2 \rangle_{\psi} I) (\Omega_1 - \langle \Omega_1 \rangle_{\psi} I) - (\Omega_1 - \langle \Omega_1 \rangle_{\psi} I) (\Omega_2 - \langle \Omega_2 \rangle_{\psi} I) ) \psi, \psi \rangle \big) \Big| ^ 2 \\ &= \Big| \frac{1}{2i} \big(\langle ( (\Omega_2 \Omega_1 - \langle \Omega_1 \rangle_{\psi} \Omega_2 - \langle \Omega_2 \rangle_{\psi} \Omega_1 + \langle \Omega_1 \rangle_{\psi} \langle \Omega_2 \rangle_{\psi} I ) - (\Omega_1 \Omega_2 - \langle \Omega_2 \rangle_{\psi} \Omega_1 - \langle \Omega_1 \rangle_{\psi} \Omega_2 + \langle \Omega_1 \rangle_{\psi} \langle \Omega_2 \rangle_{\psi} I ) ) \psi, \psi \rangle \big) \Big| ^ 2 \\ &= \Big| \frac{1}{2i} \big(\langle (\Omega_2 \Omega_1 - \Omega_1 \Omega_2) \psi, \psi \rangle \big) \Big| ^ 2 \\ &= \Big| \frac{1}{2i} \big(\langle - [\Omega_1, \Omega_2] \psi, \psi \rangle \big) \Big| ^ 2 \\ &= \Big| -\frac{1}{2i} \big(\langle [\Omega_1, \Omega_2] \psi, \psi \rangle \big) \Big| ^ 2 \\ &= \Big| -\frac{1}{2i} \langle [\Omega_1, \Omega_2] \rangle_{\psi} \Big| ^ 2 \\ &= \frac{1}{4} | \langle [\Omega_1, \Omega_2] \rangle_{\psi} | ^ 2 \\ \end{align}\]

This concludes the proof.
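
The final inequality can be checked numerically on random inputs. The following is a minimal sketch in plain Python, assuming finite-dimensional hermitian matrices and normalized states, with $\langle \Omega \rangle_{\psi} = \langle \Omega \psi, \psi \rangle$ as in the derivation above. All helper names are made up for this illustration, and the small tolerance only absorbs floating-point rounding.

```python
import math
import random


def inner(u, v):
    """Inner product <u, v> = v^dagger u."""
    return sum(vi.conjugate() * ui for ui, vi in zip(u, v))


def matvec(A, u):
    """Matrix-vector product A u for a matrix stored as a list of rows."""
    return [sum(a * x for a, x in zip(row, u)) for row in A]


def random_hermitian(n, rng):
    """Random n-by-n hermitian matrix: real diagonal, conjugate-symmetric off-diagonal."""
    A = [[0j] * n for _ in range(n)]
    for i in range(n):
        A[i][i] = complex(rng.uniform(-1, 1), 0.0)
        for j in range(i + 1, n):
            z = complex(rng.uniform(-1, 1), rng.uniform(-1, 1))
            A[i][j] = z
            A[j][i] = z.conjugate()
    return A


def random_state(n, rng):
    """Random normalized state vector."""
    psi = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
    norm = math.sqrt(inner(psi, psi).real)
    return [x / norm for x in psi]


def variance(A, psi):
    """Variance as the squared norm of Delta(A) psi = A psi - <A> psi."""
    a_psi = matvec(A, psi)
    mean = inner(a_psi, psi).real  # real because A is hermitian
    delta_psi = [y - mean * x for x, y in zip(psi, a_psi)]
    return inner(delta_psi, delta_psi).real


def commutator_expectation(A, B, psi):
    """<[A, B]>_psi = <AB psi, psi> - <BA psi, psi>."""
    ab_psi = matvec(A, matvec(B, psi))
    ba_psi = matvec(B, matvec(A, psi))
    return inner(ab_psi, psi) - inner(ba_psi, psi)


rng = random.Random(0)  # fixed seed so the run is reproducible
results = []
for _ in range(100):
    A = random_hermitian(3, rng)
    B = random_hermitian(3, rng)
    psi = random_state(3, rng)
    lhs = variance(A, psi) * variance(B, psi)
    rhs = 0.25 * abs(commutator_expectation(A, B, psi)) ** 2
    results.append((lhs, rhs))

# The inequality should hold in every trial.
print(all(lhs >= rhs - 1e-12 for lhs, rhs in results))
```

A passing run is of course not a proof, but it is a useful guard against sign or convention mistakes when following the derivation.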


In Heisenberg’s general uncertainty principle, the commutator $[\Omega_1, \Omega_2] = 0$ means $\Omega_1 \Omega_2 = \Omega_2 \Omega_1$, and therefore $\Omega_1 \Omega_2 | \psi \rangle = \Omega_2 \Omega_1 | \psi \rangle$ for every state. This means that the system states after the two sequential measurements $\Omega_1 \Omega_2$ and $\Omega_2 \Omega_1$ are the same if $[\Omega_1, \Omega_2] = 0$. Otherwise, if $[\Omega_1, \Omega_2] \neq 0$, the system states after the two sequential measurements $\Omega_1 \Omega_2$ and $\Omega_2 \Omega_1$ are in general not the same. If the commutator $[\Omega_1, \Omega_2] = 0$, the lower bound in Heisenberg’s general uncertainty principle is zero, which suggests that the two physical observables that $\Omega_1$ and $\Omega_2$ measure have no limit in precision.
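
To make the two cases concrete, here is a sketch in plain Python that compares a non-commuting pair with a commuting pair on the same state. The Pauli matrices `X`, `Y`, `Z`, the diagonal matrix `D`, and the state `psi` are choices made only for this illustration and do not appear in the derivation above. The helper names are likewise made up.

```python
def inner(u, v):
    """Inner product <u, v> = v^dagger u."""
    return sum(vi.conjugate() * ui for ui, vi in zip(u, v))


def matvec(A, u):
    """Matrix-vector product A u."""
    return [sum(a * x for a, x in zip(row, u)) for row in A]


def matmul(A, B):
    """Matrix product A B for matrices stored as lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def commutator(A, B):
    """[A, B] = AB - BA."""
    ab, ba = matmul(A, B), matmul(B, A)
    return [[x - y for x, y in zip(r1, r2)] for r1, r2 in zip(ab, ba)]


def variance(A, psi):
    """Variance as the squared norm of A psi - <A> psi."""
    a_psi = matvec(A, psi)
    mean = inner(a_psi, psi).real
    delta_psi = [y - mean * x for x, y in zip(psi, a_psi)]
    return inner(delta_psi, delta_psi).real


def bound(A, B, psi):
    """Right-hand side of the inequality: |<[A, B]>_psi|^2 / 4."""
    return 0.25 * abs(inner(matvec(commutator(A, B), psi), psi)) ** 2


X = [[0j, 1 + 0j], [1 + 0j, 0j]]
Y = [[0j, -1j], [1j, 0j]]
Z = [[1 + 0j, 0j], [0j, -1 + 0j]]
D = [[1 + 0j, 0j], [0j, 3 + 0j]]  # diagonal, so it commutes with Z
psi = [1 + 0j, 0j]                # normalized state, an eigenvector of Z and D

# Non-commuting pair: the bound is strictly positive.
lhs_xy = variance(X, psi) * variance(Y, psi)
rhs_xy = bound(X, Y, psi)
print(lhs_xy, rhs_xy)

# Commuting pair: the bound is zero, and both variances vanish on this state.
lhs_zd = variance(Z, psi) * variance(D, psi)
rhs_zd = bound(Z, D, psi)
print(lhs_zd, rhs_zd)
```

For the pair `X`, `Y` on this state the two sides coincide, so the bound is tight there. For the commuting pair `Z`, `D`, the state is a common eigenvector, which is why both observables can be sharp at once.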

In other formulations of Heisenberg’s general uncertainty principle, we sometimes see the word “simultaneity”. How should we understand simultaneity here? Given that measurement changes the system state, and that it is almost impossible to achieve absolute simultaneity in the time domain, what does simultaneity mean in this case? Here simultaneity means that the order of the measurements $\Omega_1$ and $\Omega_2$ does not change the final observation, since even if we try hard to make them simultaneous, it is impossible to control the exact order of these two measurements. In short, simultaneity just means $[\Omega_1, \Omega_2] = 0$.

How do we measure the variance of a physical observable for a system state, as it appears on the left side of the inequality of Heisenberg’s general uncertainty principle? Measurement changes the system state in quantum mechanics, so we need to prepare many identical copies of the system state. Once a copy has been measured, it should be discarded and not used for measuring the physical observable again. Measurement on the same system state therefore means measurement on identically prepared copies, not repeated measurement on one and the same system.
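
The procedure can be simulated classically. The sketch below, in plain Python, assumes the Born rule: measuring an observable on a copy returns eigenvalue $\lambda_k$ with probability $| \langle \psi, e_k \rangle | ^2$, where $e_k$ is the corresponding normalized eigenvector. The observable (the Pauli matrix $X$ with hard-coded eigenvectors), the state, and the number of copies are all choices made for this illustration.

```python
import math
import random


def inner(u, v):
    """Inner product <u, v> = v^dagger u."""
    return sum(vi.conjugate() * ui for ui, vi in zip(u, v))


# State to be measured: a normalized real vector chosen for illustration.
theta = math.pi / 8
psi = [math.cos(theta), math.sin(theta)]

# Eigenvalues and normalized eigenvectors of the Pauli X matrix (hard-coded).
s = 1 / math.sqrt(2)
eigenpairs = [(+1.0, [s, s]), (-1.0, [s, -s])]

# Born rule: probability of each outcome on a freshly prepared copy.
values = [val for val, _ in eigenpairs]
probs = [abs(inner(psi, vec)) ** 2 for _, vec in eigenpairs]

# Exact mean and variance computed from the outcome distribution.
exact_mean = sum(p * val for p, val in zip(probs, values))
exact_var = sum(p * val ** 2 for p, val in zip(probs, values)) - exact_mean ** 2

# Measure many identically prepared copies, one measurement per copy.
rng = random.Random(0)  # fixed seed so the run is reproducible
samples = rng.choices(values, weights=probs, k=100_000)
sample_mean = sum(samples) / len(samples)
sample_var = sum((x - sample_mean) ** 2 for x in samples) / len(samples)

print(exact_var)   # variance predicted by the state
print(sample_var)  # variance estimated from the simulated measurements
```

Each simulated sample stands for one copy that is measured once and then discarded, so the estimate improves only by preparing more copies.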


It is quite amazing that Heisenberg’s general uncertainty principle can be derived in such a simple way. Unfortunately, my college physics course instructor never showed a proof of this important principle.