# Heisenberg's Uncertainty Principle

## Introduction

In quantum mechanics, one of the key discoveries is that two physical observables cannot always be measured precisely at the same time. This is Heisenberg’s general uncertainty principle.

We might have learned a special case of Heisenberg’s general uncertainty principle in high school or college physics courses: the more precisely the position of a particle is determined, the less precisely its momentum can be predicted from initial conditions, and vice versa. In this case, the two physical observables are position and momentum.

Formally, Heisenberg’s general uncertainty principle states that the product of the variances of two arbitrary hermitian operators on a given state is always greater than or equal to one-fourth of the squared absolute value of the expected value of their commutator. In formulas:

$$\mathbb{V}_{\psi}(\Omega_1) \mathbb{V}_{\psi}(\Omega_2) \geq \frac{1}{4} | \langle [\Omega_1, \Omega_2] \rangle_{\psi} | ^2$$

where $\mathbb{V}_{\psi}(\Omega_1)$ and $\mathbb{V}_{\psi}(\Omega_2)$ are the variances of $\Omega_1$ and $\Omega_2$ on the state $\psi$, using the notation from my previous post, and $[\Omega_1, \Omega_2] = \Omega_1 \Omega_2 - \Omega_2 \Omega_1$ is called the commutator of $\Omega_1$ and $\Omega_2$.
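To make the statement concrete, here is a minimal numerical sketch in pure Python. It assumes the Pauli matrices `X` and `Y` as the two hermitian operators and the state $|0\rangle$ as $\psi$; these choices, and all helper names, are illustrative and not taken from the derivation in this post. The variance is computed as $\langle \Omega^2 \rangle_{\psi} - \langle \Omega \rangle_{\psi}^2$, which is equivalent to the definition used later.

```python
# Check V(A) V(B) >= 1/4 |<[A, B]>|^2 for 2x2 hermitian matrices.

def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expectation(m, psi):
    """Expected value psi^dagger (m psi) for a normalized state psi."""
    mpsi = matvec(m, psi)
    return sum(p.conjugate() * q for p, q in zip(psi, mpsi))

def variance(m, psi):
    """V(m) = <m^2> - <m>^2, which is real for hermitian m."""
    mean = expectation(m, psi)
    return (expectation(matmul(m, m), psi) - mean * mean).real

def commutator(a, b):
    """[a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]

# Illustrative choice: Pauli X and Y, state |0>.
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
psi = [1 + 0j, 0j]

lhs = variance(X, psi) * variance(Y, psi)
rhs = 0.25 * abs(expectation(commutator(X, Y), psi)) ** 2
print(lhs, rhs)
```

For this particular state both sides come out equal, so the bound is tight here; other states generally give a strict inequality.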

In this blog post, I would like to show a mathematical proof for Heisenberg’s general uncertainty principle.

## Prerequisites

We will use the following four theorems and properties to prove Heisenberg’s general uncertainty principle.

### Cauchy–Schwarz Inequality

The Cauchy–Schwarz inequality states that for all vectors $u$ and $v$ of an inner product space it is true that

$$\langle u, u \rangle \langle v, v \rangle \geq | \langle u, v \rangle | ^ 2$$

The proof can be found on Wikipedia.
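As a quick sanity check rather than a proof, the inequality can be evaluated on a pair of complex vectors. The sketch below assumes the inner product convention $\langle u, v \rangle = v^{\dagger} u$ that this post uses; the vectors `u` and `v` are arbitrary illustrative values.

```python
def inner(u, v):
    """<u, v> = v^dagger u, linear in the first argument."""
    return sum(vi.conjugate() * ui for ui, vi in zip(u, v))

# Arbitrary illustrative vectors in C^2.
u = [1 + 2j, 3 - 1j]
v = [2 - 1j, 1j]

lhs = (inner(u, u) * inner(v, v)).real   # <u, u><v, v>
rhs = abs(inner(u, v)) ** 2              # |<u, v>|^2
print(lhs, rhs, lhs >= rhs)
```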

### Imaginary Part of Vector Inner Product

The imaginary part of the inner product of any two vectors $u$ and $v$, $\text{Im}\big(\langle u, v \rangle\big)$, can be computed as

$$\text{Im}\big( \langle u, v \rangle \big) = \frac{1}{2i} \big(\langle u, v \rangle - \langle v, u \rangle \big)$$

To see this, write $u = a_1 + b_1 i$ and $v = a_2 + b_2 i$, where $a_1$, $a_2$, $b_1$, and $b_2$ are real (column) vectors. Then

\begin{align} \langle u, v \rangle &= v^{\dagger} u \\ &= (a_2 + b_2 i)^{\dagger} (a_1 + b_1 i) \\ &= (a_2^{\top} - b_2^{\top} i) (a_1 + b_1 i) \\ &= (a_2^{\top} a_1 + b_2^{\top} b_1) + (a_2^{\top} b_1 - b_2^{\top} a_1) i \\ &= \text{Re}\big(\langle u, v \rangle\big) + \text{Im}\big(\langle u, v \rangle\big) i \end{align}

Similarly, using $a_1^{\top} a_2 = a_2^{\top} a_1$ and the analogous identities for the other real vectors,

\begin{align} \langle v, u \rangle &= u^{\dagger} v \\ &= (a_1 + b_1 i)^{\dagger} (a_2 + b_2 i) \\ &= (a_1^{\top} - b_1^{\top} i) (a_2 + b_2 i) \\ &= (a_1^{\top} a_2 + b_1^{\top} b_2) + (a_1^{\top} b_2 - b_1^{\top} a_2) i \\ &= (a_2^{\top} a_1 + b_2^{\top} b_1) - (a_2^{\top} b_1 - b_2^{\top} a_1) i \\ &= \text{Re}\big(\langle u, v \rangle\big) - \text{Im}\big(\langle u, v \rangle\big) i \end{align}

Subtracting the second equation from the first and dividing by $2i$ gives

$$\text{Im}\big( \langle u, v \rangle \big) = \frac{1}{2i} \big(\langle u, v \rangle - \langle v, u \rangle \big)$$

This concludes the proof.
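The identity is easy to check numerically. The following sketch uses the same convention $\langle u, v \rangle = v^{\dagger} u$ and two arbitrary illustrative vectors, and compares the formula against the imaginary part that Python reports directly.

```python
def inner(u, v):
    """<u, v> = v^dagger u, linear in the first argument."""
    return sum(vi.conjugate() * ui for ui, vi in zip(u, v))

# Arbitrary illustrative vectors in C^2.
u = [1 + 2j, 3 - 1j]
v = [2 - 1j, 1j]

direct = inner(u, v).imag                          # Im(<u, v>) read off directly
from_formula = (inner(u, v) - inner(v, u)) / 2j    # (<u, v> - <v, u>) / (2i)
print(direct, from_formula)
```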

### Triangle Inequality

For any complex number $z$, we have

$$| z | ^2 \geq | \text{Im}(z) | ^2$$

This follows directly from $| z |^2 = \text{Re}(z)^2 + \text{Im}(z)^2 \geq \text{Im}(z)^2$. Geometrically, $|z|$ is the hypotenuse of a right triangle in the complex plane whose legs are $|\text{Re}(z)|$ and $|\text{Im}(z)|$, so it is at least as long as either leg.

### Hermitian Property

If $A$ is a hermitian $n$-by-$n$ matrix, then for all $u, v \in \mathbb{C}^n$, we have

$$\langle Au, v \rangle = \langle u, Av \rangle$$

Using the defining property of a hermitian matrix, $A^{\dagger} = A$, we simply have

\begin{align} \langle Au, v \rangle &= v^{\dagger} Au \\ &= v^{\dagger} A^{\dagger} u \\ &= {(Av)}^{\dagger} u \\ &= \langle u, Av \rangle \end{align}

This concludes the proof.
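A small numerical check of this property, again as a sketch: the matrix `A` below is an arbitrary hermitian matrix chosen for illustration, and `inner` follows the convention $\langle u, v \rangle = v^{\dagger} u$.

```python
def inner(u, v):
    """<u, v> = v^dagger u, linear in the first argument."""
    return sum(vi.conjugate() * ui for ui, vi in zip(u, v))

def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

# Illustrative hermitian matrix: equal to its own conjugate transpose.
A = [[2, 1 - 1j],
     [1 + 1j, 3]]
u = [1 + 2j, 3 - 1j]
v = [2 - 1j, 1j]

left = inner(matvec(A, u), v)    # <Au, v>
right = inner(u, matvec(A, v))   # <u, Av>
print(left, right)
```

If `A` were replaced by a non-hermitian matrix, the two values would in general differ.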

## Proof of Heisenberg’s General Uncertainty Principle

Define $\Delta_{\psi}(\Omega) = \Omega - \langle \Omega \rangle_{\psi} I$. Because $\Omega$ is a hermitian operator, its expected value $\langle \Omega \rangle_{\psi}$ is real, so $\Delta_{\psi}(\Omega)$ is also a hermitian operator. Based on the definition of variance and the hermitian property, we further have

\begin{align} \mathbb{V}_{\psi}(\Omega) &= \langle (\Delta_{\psi}(\Omega)) (\Delta_{\psi}(\Omega)) \rangle_{\psi} \\ &= \langle (\Delta_{\psi}(\Omega)) (\Delta_{\psi}(\Omega))\psi, \psi \rangle \\ &= \langle (\Delta_{\psi}(\Omega))\psi, (\Delta_{\psi}(\Omega)) \psi \rangle \\ \end{align}

We apply the Cauchy–Schwarz inequality to the left-hand side of Heisenberg’s general uncertainty principle,

\begin{align} \mathbb{V}_{\psi}(\Omega_1) \mathbb{V}_{\psi}(\Omega_2) &= \langle (\Delta_{\psi}(\Omega_1))\psi, (\Delta_{\psi}(\Omega_1)) \psi \rangle \langle (\Delta_{\psi}(\Omega_2))\psi, (\Delta_{\psi}(\Omega_2)) \psi \rangle \\ &\geq | \langle (\Delta_{\psi}(\Omega_1))\psi, (\Delta_{\psi}(\Omega_2))\psi \rangle | ^ 2 \end{align}

We further apply the Triangle inequality,

\begin{align} \mathbb{V}_{\psi}(\Omega_1) \mathbb{V}_{\psi}(\Omega_2) &\geq | \langle (\Delta_{\psi}(\Omega_1))\psi, (\Delta_{\psi}(\Omega_2))\psi \rangle | ^ 2 \\ &\geq \big| \text{Im}\big( \langle (\Delta_{\psi}(\Omega_1))\psi, (\Delta_{\psi}(\Omega_2))\psi \rangle \big) \big| ^ 2 \end{align}

We then use the formula for the imaginary part of an inner product,

\begin{align} \mathbb{V}_{\psi}(\Omega_1) \mathbb{V}_{\psi}(\Omega_2) &\geq \big| \text{Im}\big( \langle (\Delta_{\psi}(\Omega_1))\psi, (\Delta_{\psi}(\Omega_2))\psi \rangle \big) \big| ^ 2 \\ &= \Big| \frac{1}{2i} \big(\langle (\Delta_{\psi}(\Omega_1))\psi, (\Delta_{\psi}(\Omega_2))\psi \rangle - \langle (\Delta_{\psi}(\Omega_2))\psi, (\Delta_{\psi}(\Omega_1))\psi \rangle \big) \Big| ^ 2 \end{align}

We use the hermitian property again,

\begin{align} \mathbb{V}_{\psi}(\Omega_1) \mathbb{V}_{\psi}(\Omega_2) &\geq \Big| \frac{1}{2i} \big(\langle (\Delta_{\psi}(\Omega_1))\psi, (\Delta_{\psi}(\Omega_2))\psi \rangle - \langle (\Delta_{\psi}(\Omega_2))\psi, (\Delta_{\psi}(\Omega_1))\psi \rangle \big) \Big| ^ 2 \\ &= \Big| \frac{1}{2i} \big(\langle (\Delta_{\psi}(\Omega_2)) (\Delta_{\psi}(\Omega_1))\psi, \psi \rangle - \langle (\Delta_{\psi}(\Omega_1)) (\Delta_{\psi}(\Omega_2))\psi, \psi \rangle \big) \Big| ^ 2 \\ &= \Big| \frac{1}{2i} \big(\langle ( (\Delta_{\psi}(\Omega_2)) (\Delta_{\psi}(\Omega_1)) - (\Delta_{\psi}(\Omega_1)) (\Delta_{\psi}(\Omega_2)) ) \psi, \psi \rangle \big) \Big| ^ 2 \\ &= \Big| \frac{1}{2i} \big(\langle ( (\Omega_2 - \langle \Omega_2 \rangle_{\psi} I) (\Omega_1 - \langle \Omega_1 \rangle_{\psi} I) \\ &\qquad - (\Omega_1 - \langle \Omega_1 \rangle_{\psi} I) (\Omega_2 - \langle \Omega_2 \rangle_{\psi} I) ) \psi, \psi \rangle \big) \Big| ^ 2 \\ &= \Big| \frac{1}{2i} \big(\langle ( (\Omega_2 \Omega_1 - \langle \Omega_1 \rangle_{\psi} \Omega_2 - \langle \Omega_2 \rangle_{\psi} \Omega_1 + \langle \Omega_1 \rangle_{\psi} \langle \Omega_2 \rangle_{\psi} I ) \\ &\qquad - (\Omega_1 \Omega_2 - \langle \Omega_2 \rangle_{\psi} \Omega_1 - \langle \Omega_1 \rangle_{\psi} \Omega_2 + \langle \Omega_1 \rangle_{\psi} \langle \Omega_2 \rangle_{\psi} I ) ) \psi, \psi \rangle \big) \Big| ^ 2 \\ &= \Big| \frac{1}{2i} \big(\langle (\Omega_2 \Omega_1 - \Omega_1 \Omega_2) \psi, \psi \rangle \big) \Big| ^ 2 \\ &= \Big| \frac{1}{2i} \big(\langle - [\Omega_1, \Omega_2] \psi, \psi \rangle \big) \Big| ^ 2 \\ &= \Big| -\frac{1}{2i} \big(\langle [\Omega_1, \Omega_2] \psi, \psi \rangle \big) \Big| ^ 2 \\ &= \Big| -\frac{1}{2i} \langle [\Omega_1, \Omega_2] \rangle_{\psi} \Big| ^ 2 \\ &= \frac{1}{4} | \langle [\Omega_1, \Omega_2] \rangle_{\psi} | ^ 2 \end{align}

This concludes the proof.
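As a short worked example, consider the position and momentum case mentioned in the introduction. It assumes the canonical commutation relation $[X, P] = i \hbar I$, which is standard but is not derived in this post; the symbols $X$, $P$, $\hbar$, and $\sigma$ are introduced here only for illustration. For a normalized state, $\langle I \rangle_{\psi} = 1$, so

$$\mathbb{V}_{\psi}(X) \mathbb{V}_{\psi}(P) \geq \frac{1}{4} | \langle i \hbar I \rangle_{\psi} | ^2 = \frac{1}{4} | i \hbar | ^2 = \frac{\hbar^2}{4}$$

Taking square roots, and writing $\sigma$ for the standard deviation, gives the familiar form $\sigma_X \sigma_P \geq \hbar / 2$.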

## Commutator

In the derivation above, we introduced a new operator $[\Omega_1, \Omega_2] = \Omega_1 \Omega_2 - \Omega_2 \Omega_1$, which is called the commutator. Even though $\Omega_1$ and $\Omega_2$ are hermitian, the commutator $[\Omega_1, \Omega_2]$ is in general not hermitian. Since $[\Omega_1, \Omega_2]^{\dagger} = \Omega_2 \Omega_1 - \Omega_1 \Omega_2 = -[\Omega_1, \Omega_2]$, it is anti-hermitian, and it is hermitian only when it is zero.
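To see what a commutator of two hermitian operators can look like, here is a minimal sketch that computes $[X, Y] = XY - YX$ for the Pauli matrices `X` and `Y`. These matrices are an illustrative assumption, not operators used elsewhere in this post. The code also checks whether the result equals the negative of its own conjugate transpose.

```python
def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(a, b):
    """[a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]

def dagger(m):
    """Conjugate transpose of a square matrix."""
    n = len(m)
    return [[complex(m[j][i]).conjugate() for j in range(n)] for i in range(n)]

# Illustrative hermitian operators: Pauli X and Y.
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]

C = commutator(X, Y)
Cd = dagger(C)
# True when C^dagger = -C, i.e. when C is anti-hermitian.
is_anti_hermitian = all(abs(Cd[i][j] + C[i][j]) < 1e-12
                        for i in range(2) for j in range(2))
print(C, is_anti_hermitian)
```

In this example the commutator is nonzero, so the two operators do not commute.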

Given a system state $| \psi \rangle$, what are $\Omega_2 \Omega_1 | \psi \rangle$ and $\Omega_1 \Omega_2 | \psi \rangle$?

Suppose first that $\Omega_1$ and $\Omega_2$ share the same basis, i.e., the same set of eigenvectors $| x_0 \rangle$, $| x_1 \rangle$, $\cdots$, $| x_{n-1} \rangle$. Note that even if $\Omega_1$ and $\Omega_2$ have the same eigenvectors, they can have different eigenvalues. For example, $\Omega_1$ and $k \Omega_1$, where $k$ is a constant scalar, have the same eigenvectors, but their eigenvalues differ by a factor of $k$.

After measuring the system state using $\Omega_1$, the system state will collapse to a basis state, say $| x_i \rangle$, with probability $|c_i|^2$, where $c_i$ is the coefficient of $| x_i \rangle$ in the expansion $| \psi \rangle = \sum_{i=0}^{n-1} c_i | x_i \rangle$, and the observed value is the eigenvalue $\lambda_i$ of $\Omega_1$ corresponding to the eigenvector $| x_i \rangle$. When measuring the collapsed system state $| x_i \rangle$ using $\Omega_2$, because $| x_i \rangle$ is also a basis state for $\Omega_2$, the state will remain $| x_i \rangle$ with probability 1.0, and the observed value is the eigenvalue $\eta_i$ of $\Omega_2$ corresponding to the eigenvector $| x_i \rangle$. This means there is no uncertainty in the second measurement once the first has been performed.

In this case,

\begin{align} \Omega_2 \Omega_1 | \psi \rangle &= \sum_{i=0}^{n-1} c_i \Omega_2 \Omega_1 | x_i \rangle \\ &= \sum_{i=0}^{n-1} c_i \lambda_i \Omega_2 | x_i \rangle \\ &= \sum_{i=0}^{n-1} c_i \eta_i \lambda_i | x_i \rangle \\ \end{align}

where

$$|c_0|^2 + |c_1|^2 + \cdots + |c_{n-1}|^2 = 1$$

The expected value of the product of the two measurements is

\begin{align} \langle \Omega_2 \Omega_1 \rangle_{\psi} &= \langle \psi | \Omega_2 \Omega_1 | \psi \rangle \\ &= \left( \sum_{i=0}^{n-1} \overline{c}_i \langle x_i | \right) \left( \sum_{j=0}^{n-1} c_j \eta_j \lambda_j | x_j \rangle \right) \\ &= \sum_{i=0}^{n-1} \sum_{j=0}^{n-1} \overline{c}_i c_j \eta_j \lambda_j \langle x_i | x_j \rangle \end{align}

Because the eigenvectors of a hermitian operator can be chosen to be orthonormal, $\langle x_i | x_j \rangle = 1$ if $i = j$, otherwise $\langle x_i | x_j \rangle = 0$, and we further have

\begin{align} \langle \Omega_2 \Omega_1 \rangle_{\psi} &= \sum_{i=0}^{n-1} \sum_{j=0}^{n-1} \overline{c}_i c_j \eta_j \lambda_j \langle x_i | x_j \rangle \\ &= \sum_{i=0}^{n-1} | c_i |^2 \eta_i \lambda_i \end{align}

Similarly, we could also get

\begin{align} \langle \Omega_1 \Omega_2 \rangle_{\psi} &= \sum_{i=0}^{n-1} | c_i |^2 \eta_i \lambda_i \\ \end{align}

Thus,

\begin{align} \langle \Omega_2 \Omega_1 \rangle_{\psi} &= \langle \psi | \Omega_2 \Omega_1 | \psi \rangle \\ &= \langle \psi | \Omega_1 \Omega_2 | \psi \rangle \\ &= \langle \Omega_1 \Omega_2 \rangle_{\psi} \\ \end{align}

Therefore, the expected measurement of the commutator $[\Omega_1, \Omega_2]$ on the system state $| \psi \rangle$ is

\begin{align} \langle [\Omega_1, \Omega_2] \rangle_{\psi} &= \langle \psi | [\Omega_1, \Omega_2] | \psi \rangle \\ &= \langle \psi | \Omega_1 \Omega_2 - \Omega_2 \Omega_1 | \psi \rangle \\ &= \langle \psi | \Omega_1 \Omega_2 | \psi \rangle - \langle \psi | \Omega_2 \Omega_1 | \psi \rangle \\ &= \langle \Omega_1 \Omega_2 \rangle_{\psi} - \langle \Omega_2 \Omega_1 \rangle_{\psi} \\ &= 0 \end{align}

Statistically, that is, over many samples rather than a single one, it makes no difference whether we measure the system state using the operator $\Omega_1$ or $\Omega_2$ first. Consistent with Heisenberg’s uncertainty principle, the lower bound is zero, so nothing prevents both observables from being measured accurately: the second measurement has no uncertainty at all once the first measurement has been performed.
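The shared-basis case can be sketched numerically by writing both operators as diagonal matrices in their common eigenbasis. The eigenvalues `lam` and `eta` and the coefficients `c` below are hypothetical values chosen only for illustration.

```python
def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expectation(m, psi):
    """Expected value psi^dagger (m psi) for a normalized state psi."""
    mpsi = matvec(m, psi)
    return sum(p.conjugate() * q for p, q in zip(psi, mpsi))

def diag(values):
    """Diagonal matrix: an operator written in its own eigenbasis."""
    n = len(values)
    return [[values[i] if i == j else 0.0 for j in range(n)] for i in range(n)]

# Hypothetical values for a shared eigenbasis {|x_0>, |x_1>}.
lam = [1.0, 2.0]    # eigenvalues of Omega_1
eta = [3.0, 5.0]    # eigenvalues of Omega_2
c = [0.6, 0.8j]     # coefficients of |psi>, with |0.6|^2 + |0.8|^2 = 1

O1, O2 = diag(lam), diag(eta)
e21 = expectation(matmul(O2, O1), c)   # <Omega_2 Omega_1>
e12 = expectation(matmul(O1, O2), c)   # <Omega_1 Omega_2>
closed_form = sum(abs(ci) ** 2 * e * l for ci, e, l in zip(c, eta, lam))
print(e21, e12, closed_form)
```

All three values agree, so the expected value of the commutator is zero for this state.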

Now suppose $\Omega_1$ and $\Omega_2$ have different sets of eigenvectors: $\Omega_1$ uses $| x_0 \rangle$, $| x_1 \rangle$, $\cdots$, $| x_{n-1} \rangle$, whereas $\Omega_2$ uses $| y_0 \rangle$, $| y_1 \rangle$, $\cdots$, $| y_{n-1} \rangle$.

After measuring the system state using $\Omega_1$, the system state will collapse to a basis state, say $| x_i \rangle$, with probability $|c_i|^2$, and the observed value is the eigenvalue $\lambda_i$ corresponding to the eigenvector $| x_i \rangle$. When measuring the collapsed system state $| x_i \rangle$ using $\Omega_2$, because $| x_i \rangle$ is a superposition of the basis states of $\Omega_2$, $| x_i \rangle = \sum_{j=0}^{n-1} d_{i,j} | y_j \rangle$, the collapsed state will be $| y_j \rangle$ with probability $|d_{i,j}|^2$, and the observed value is the eigenvalue $\eta_j$ corresponding to the eigenvector $| y_j \rangle$.

In this case,

\begin{align} \Omega_2 \Omega_1 | \psi \rangle &= \sum_{i=0}^{n-1} c_i \Omega_2 \Omega_1 | x_i \rangle \\ &= \sum_{i=0}^{n-1} c_i \lambda_i \Omega_2 | x_i \rangle \\ \end{align}

The expected value of the product of the two measurements is

\begin{align} \langle \Omega_2 \Omega_1 \rangle_{\psi} &= \langle \psi | \Omega_2 \Omega_1 | \psi \rangle \\ &= \left( \sum_{i=0}^{n-1} \overline{c}_i \langle x_i | \right) \left( \sum_{j=0}^{n-1} c_j \lambda_j \Omega_2 | x_j \rangle \right) \\ &= \sum_{i=0}^{n-1} \sum_{j=0}^{n-1} \overline{c}_i c_j \lambda_j \langle x_i | \Omega_2 | x_j \rangle \end{align}

Notice that if $\Omega_1$ and $\Omega_2$ have the same basis, so that $\Omega_2 | x_j \rangle = \eta_j | x_j \rangle$, this reduces to the expected value we derived for the shared-basis case.

\begin{align} \langle \Omega_2 \Omega_1 \rangle_{\psi} &= \sum_{i=0}^{n-1} \sum_{j=0}^{n-1} \overline{c}_i c_j \lambda_j \langle x_i | \Omega_2 | x_j \rangle \\ &= \sum_{i=0}^{n-1} \sum_{j=0}^{n-1} \overline{c}_i c_j \lambda_j \langle x_i | \eta_j | x_j \rangle \\ &= \sum_{i=0}^{n-1} \sum_{j=0}^{n-1} \overline{c}_i c_j \eta_j \lambda_j \langle x_i | x_j \rangle \\ &= \sum_{i=0}^{n-1} | c_i |^2 \eta_i \lambda_i \end{align}

Similarly, expanding $| \psi \rangle = \sum_{j=0}^{n-1} d_j | y_j \rangle$ in the eigenvectors of $\Omega_2$, we could also get

\begin{align} \langle \Omega_1 \Omega_2 \rangle_{\psi} &= \sum_{i=0}^{n-1} \sum_{j=0}^{n-1} \overline{d}_i d_j \eta_j \langle y_i | \Omega_1 | y_j \rangle \end{align}

Because the system state $| \psi \rangle$ can be arbitrary, there exist $\Omega_1$ and $\Omega_2$ such that $\langle \Omega_2 \Omega_1 \rangle_{\psi} \neq \langle \Omega_1 \Omega_2 \rangle_{\psi}$, i.e.,

\begin{align} \langle [\Omega_1, \Omega_2] \rangle_{\psi} &= \langle \Omega_1 \Omega_2 - \Omega_2 \Omega_1 \rangle_{\psi} \\ &= \langle \Omega_1 \Omega_2 \rangle_{\psi} - \langle \Omega_2 \Omega_1 \rangle_{\psi} \\ &\neq 0 \end{align}

In this case, the uncertainty principle places a nonzero lower bound on the product of the variances, so the two observables cannot both be measured with arbitrary precision.

The commutator quantifies how well the two observables described by the two operators can be measured simultaneously. If the commutator is 0, the two observables can be measured simultaneously, and the order of the measurements (even simultaneous measurements are applied in some order) will not affect the measurement outcomes.
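A concrete instance of two operators with different eigenvectors can be sketched with the Pauli matrices `Z` and `X` and the state $(|0\rangle + i|1\rangle)/\sqrt{2}$. These operators and this state are illustrative assumptions, not part of the derivation above. The code compares the expected values of the two operator orderings.

```python
def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expectation(m, psi):
    """Expected value psi^dagger (m psi) for a normalized state psi."""
    mpsi = matvec(m, psi)
    return sum(p.conjugate() * q for p, q in zip(psi, mpsi))

# Illustrative operators with different eigenvectors: Pauli Z and X.
Z = [[1, 0], [0, -1]]
X = [[0, 1], [1, 0]]

s = 2 ** -0.5
psi = [s + 0j, s * 1j]   # normalized state (|0> + i|1>) / sqrt(2)

e_zx = expectation(matmul(Z, X), psi)   # <Z X>
e_xz = expectation(matmul(X, Z), psi)   # <X Z>
e_comm = e_zx - e_xz                    # <[Z, X]> = <Z X> - <X Z>
print(e_zx, e_xz, e_comm)
```

The two orderings give different expected values here, so the expected value of the commutator is nonzero and the lower bound in the uncertainty principle is strictly positive for this state.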

## Conclusions

It is quite amazing that Heisenberg’s general uncertainty principle can be derived in such a simple way. Unfortunately, my college physics course instructor never showed a proof of this important principle.

Lei Mao

05-04-2020

04-15-2023