Matrix Exponential
Introduction
Similar to the exponential function $f(x) = \exp(x)$, the matrix exponential $f(X) = \exp(X)$ has several different definitions, all of which are mathematically equivalent.
Previously, I discussed this topic for the exponential function. In this blog post, I would like to discuss the same topic for the matrix exponential.
Prerequisites
Matrix Norm
Given a field $K$ of either real or complex numbers, let $K^{m \times n}$ be the $K$-vector space of matrices with $m$ rows and $n$ columns and entries in the field $K$. A matrix norm is a norm on $K^{m \times n}$, i.e., a function $\lVert \cdot \rVert : K^{m \times n} \rightarrow \mathbb{R}$ that satisfies the following properties.
For all scalars $\alpha \in K$ and matrices $A,B\in K^{m\times n}$,
- $\lVert A \rVert \geq 0$ (positive-valued)
- $\lVert A \rVert = 0 \iff A = 0$ (definite)
- $\lVert \alpha A \rVert = \lvert \alpha \rvert \lVert A \rVert$ (absolutely homogeneous)
- $\lVert A + B \rVert \leq \lVert A \rVert + \lVert B \rVert$ (sub-additive or satisfying the triangle inequality)
Matrix norms are particularly useful if they are also sub-multiplicative:
- $\lVert AB \rVert \leq \lVert A \rVert \lVert B \rVert$ (sub-multiplicative)
However, sub-multiplicativity is not a required property of a matrix norm.
An example of a matrix norm that also satisfies the sub-multiplicative property is the Frobenius norm.
$$
\begin{align}
\lVert A \rVert_{F} &= \sqrt{\sum_{i=1}^{m} \sum_{j=1}^{n} \lvert a_{i,j}\rvert^2} \\
&= \sqrt{\text{trace}(A^{\dagger}A)} \\
\end{align}
$$
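As a quick numerical sanity check, here is a minimal sketch, assuming NumPy is available (the variable names are illustrative only), that verifies the entry-wise definition against the trace formula and checks the sub-multiplicative property on random complex matrices.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4)) + 1j * rng.standard_normal((3, 4))
B = rng.standard_normal((4, 5)) + 1j * rng.standard_normal((4, 5))

# Frobenius norm from the entry-wise definition.
fro_entries = np.sqrt(np.sum(np.abs(A) ** 2))
# Frobenius norm from the trace formula, where A.conj().T is the conjugate transpose.
fro_trace = np.sqrt(np.trace(A.conj().T @ A).real)

assert np.isclose(fro_entries, fro_trace)
assert np.isclose(fro_entries, np.linalg.norm(A, "fro"))

# Sub-multiplicative property: ||AB||_F <= ||A||_F * ||B||_F.
assert np.linalg.norm(A @ B, "fro") <= np.linalg.norm(A, "fro") * np.linalg.norm(B, "fro")
```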
Matrix Power Series Convergence
If the analytic function $f$ has the Taylor expansion
$$
\begin{align}
f(x) &= \sum_{i=0}^{\infty} c_i x^i \\
&= c_0 + c_1 x + c_2 x^2 + \cdots \\
\end{align}
$$
then a matrix function $A \mapsto f(A)$ can be defined by substituting $x$ with a square matrix $A$: powers become matrix powers, additions become matrix sums, multiplications by coefficients become scalar multiplications, and the constant term $c_0$ becomes $c_0 I$. If the scalar series converges for $\lvert x \rvert < r$, then the corresponding matrix series
$$
\begin{align}
f(A) &= \sum_{i=0}^{\infty} c_i A^i \\
&= c_0 I + c_1 A + c_2 A^2 + \cdots \\
\end{align}
$$
converges for matrices $A$ such that $\lVert A \rVert < r$ for some matrix norm that satisfies the sub-multiplicative property $\lVert AB \rVert \leq \lVert A \rVert \lVert B \rVert$.
This can be proved with the comparison test.
Proof
Because of the sub-multiplicative property $\lVert AB \rVert \leq \lVert A \rVert \lVert B \rVert$, we must have
$$
\lVert A^n \rVert \leq \lVert A \rVert^n
$$
Together with the absolutely homogeneous property, each term of the matrix series satisfies
$$
\lVert c_n A^n \rVert = \lvert c_n \rvert \lVert A^n \rVert \leq \lvert c_n \rvert \lVert A \rVert^n
$$
A power series converges absolutely inside its radius of convergence, so $\sum_{i=0}^{\infty} \lvert c_i \rvert \lvert x \rvert^i$ converges for $\lvert x \rvert < r$. Because $\lVert A \rVert < r$, the scalar series
$$
\sum_{i=0}^{\infty} \lvert c_i \rvert \lVert A \rVert^i
$$
converges. By the comparison test, the series $\sum_{i=0}^{\infty} \lVert c_i A^i \rVert$ also converges, i.e., the matrix series $\sum_{i=0}^{\infty} c_i A^i$ converges absolutely. Because the space of matrices is finite-dimensional, it is complete with respect to any matrix norm, and every absolutely convergent series in a complete normed space converges.
Therefore, the matrix series $\sum_{i=0}^{\infty} c_i A^i$ must converge.
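To make the convergence statement concrete, here is a minimal numerical illustration, assuming NumPy is available, using the geometric series $f(x) = \frac{1}{1 - x} = \sum_{i=0}^{\infty} x^i$ whose radius of convergence is $r = 1$: for a square matrix $A$ with $\lVert A \rVert_{F} < 1$, the partial sums of $\sum_{i=0}^{\infty} A^i$ converge to $(I - A)^{-1}$.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
# Rescale A so that its Frobenius norm is strictly less than r = 1.
A = 0.5 * A / np.linalg.norm(A, "fro")

# Accumulate the partial sums of the matrix geometric series sum_i A^i.
partial_sum = np.zeros_like(A)
power = np.eye(3)  # A^0 = I
for _ in range(100):
    partial_sum += power
    power = power @ A

# Because ||A||_F < 1, the series converges to (I - A)^{-1}.
assert np.allclose(partial_sum, np.linalg.inv(np.eye(3) - A))
```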
Matrix Exponential Definitions
There are two common definitions for the matrix exponential: the series definition and the limit definition. Our goal is to prove the equivalence of the two definitions.
Series Definition
For a real or complex matrix $X \in \mathbb{C}^{n \times n}$, the exponential function $\exp(X) = e^X$ is defined as follows.
$$
\begin{align}
\exp{X} &= \sum_{n=0}^{\infty} \frac{X^n}{n!} \\
&= I + X + \frac{X^2}{2!} + \frac{X^3}{3!} + \cdots \\
\end{align}
$$
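The series definition can be checked numerically by truncating the sum. Below is a minimal sketch, assuming NumPy and SciPy are available; `expm_series` and `num_terms` are illustrative names, and `scipy.linalg.expm` serves as the reference implementation.

```python
import numpy as np
from scipy.linalg import expm

def expm_series(X, num_terms=30):
    # Truncated series definition: sum_{n=0}^{num_terms-1} X^n / n!.
    result = np.zeros_like(X)
    term = np.eye(X.shape[0])  # X^0 / 0! = I
    for n in range(num_terms):
        result += term
        # Next term: X^{n+1} / (n+1)! = (X^n / n!) @ X / (n + 1).
        term = term @ X / (n + 1)
    return result

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 4))
assert np.allclose(expm_series(X), expm(X))
```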
Limit Definition
For a real or complex matrix $X \in \mathbb{C}^{n \times n}$, the exponential function $\exp(X) = e^X$ is defined as follows.
$$
\begin{align}
\exp{X} &= \lim_{n \rightarrow \infty} \Big( I + \frac{X}{n} \Big)^n
\end{align}
$$
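The limit definition can also be checked numerically with a large but finite $n$. Below is a minimal sketch, assuming NumPy and SciPy are available; `expm_limit` and `log2_n` are illustrative names. Choosing $n$ as a power of two lets us compute $\big( I + \frac{X}{n} \big)^n$ by repeated squaring.

```python
import numpy as np
from scipy.linalg import expm

def expm_limit(X, log2_n=30):
    # Limit definition (I + X/n)^n with n = 2**log2_n.
    n = 2 ** log2_n
    result = np.eye(X.shape[0]) + X / n
    # Squaring log2_n times raises the base matrix to the power n = 2**log2_n.
    for _ in range(log2_n):
        result = result @ result
    return result

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 4))
# The finite-n approximation carries an O(1/n) error.
assert np.allclose(expm_limit(X), expm(X), atol=1e-6)
```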
Equivalence of Definitions
Proof
The majority of the proof is the same as the proof for the exponential function that I derived previously. The subtle difference lies in the convergence of the matrix exponential series.
Given a real or complex square matrix $X$, we can apply the binomial expansion, because the identity matrix $I$ and $X$ commute.
$$
\begin{align}
\Big( I + \frac{X}{n} \Big)^n &= \sum_{k=0}^{n} {n \choose k} \frac{X^k}{n^k} \\
&= \sum_{k=0}^{n} \frac{n!}{k!(n-k)!} \frac{X^k}{n^k} \\
&= \sum_{k=0}^{n} \frac{n (n-1) \cdots (n - k + 1)}{k!} \frac{X^k}{n^k} \\
&= \sum_{k=0}^{n} \frac{X^k}{k!} \frac{n (n-1) \cdots (n - k + 1)}{n^k} \\
&= \sum_{k=0}^{n} \frac{X^k}{k!} \Big(1 - \frac{1}{n}\Big) \Big(1 - \frac{2}{n}\Big) \cdots \Big(1 - \frac{k - 1}{n}\Big) \\
\end{align}
$$
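Because $I$ and $X$ commute, this binomial expansion holds exactly for any finite $n$. The following sketch, assuming NumPy and the standard-library `math.comb`, verifies the expansion for one particular $n$.

```python
import math
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((3, 3))
n = 7

# Left-hand side: (I + X/n)^n.
lhs = np.linalg.matrix_power(np.eye(3) + X / n, n)

# Right-hand side: sum_{k=0}^{n} C(n, k) * X^k / n^k.
rhs = sum(math.comb(n, k) * np.linalg.matrix_power(X, k) / n**k for k in range(n + 1))

assert np.allclose(lhs, rhs)
```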
We define
$$
u_k(n) =
\begin{cases}
\frac{X^k}{k!} \Big(1 - \frac{1}{n}\Big) \Big(1 - \frac{2}{n}\Big) \cdots \Big(1 - \frac{k - 1}{n}\Big) & \text{if $k \leq n$} \\
0 & \text{if $k > n$} \\
\end{cases}
$$
Let’s see if $\lim_{n \rightarrow \infty} u_k(n)$ exists for every possible $k$.
For any fixed $k$, we have $k \leq n$ once $n$ is sufficiently large, so the case $k > n$ does not affect the limit.
When $k \leq n$,
$$
\begin{align}
\lim_{n \rightarrow \infty} u_k(n) &= \lim_{n \rightarrow \infty} \bigg[ \frac{X^k}{k!} \Big(1 - \frac{1}{n}\Big) \Big(1 - \frac{2}{n}\Big) \cdots \Big(1 - \frac{k - 1}{n}\Big) \bigg] \\
&= \bigg(\lim_{n \rightarrow \infty} \frac{X^k}{k!} \bigg) \bigg(\lim_{n \rightarrow \infty} \Big(1 - \frac{1}{n}\Big) \bigg) \bigg(\lim_{n \rightarrow \infty} \Big(1 - \frac{2}{n}\Big) \bigg) \cdots \bigg(\lim_{n \rightarrow \infty} \Big(1 - \frac{k-1}{n}\Big) \bigg)\\
&= \lim_{n \rightarrow \infty} \frac{X^k}{k!}
\end{align}
$$
Since $\frac{X^k}{k!}$ does not depend on $n$, this limit exists trivially for every fixed $k$, and
$$
\begin{align}
\lim_{n \rightarrow \infty} u_k(n)
&= \lim_{n \rightarrow \infty} \frac{X^k}{k!} \\
&= \frac{X^k}{k!} \\
\end{align}
$$
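Before completing the derivation, we can sanity-check this limit numerically. The sketch below, assuming NumPy (with `u` an illustrative helper implementing $u_k(n)$), shows the Frobenius distance between $u_k(n)$ and $\frac{X^k}{k!}$ shrinking roughly like $\frac{1}{n}$.

```python
import math
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((3, 3))
k = 4

def u(k, n, X):
    # u_k(n) = (X^k / k!) * (1 - 1/n)(1 - 2/n)...(1 - (k-1)/n) for k <= n, else 0.
    if k > n:
        return np.zeros_like(X)
    prefactor = np.prod([1.0 - j / n for j in range(1, k)])
    return np.linalg.matrix_power(X, k) / math.factorial(k) * prefactor

target = np.linalg.matrix_power(X, k) / math.factorial(k)
for n in (10, 100, 1000, 10000):
    print(n, np.linalg.norm(u(k, n, X) - target, "fro"))
```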
It remains to justify interchanging the limit and the summation. Let $\lVert \cdot \rVert$ be a matrix norm that satisfies the sub-multiplicative property $\lVert AB \rVert \leq \lVert A \rVert \lVert B \rVert$. Because each factor $\Big(1 - \frac{j}{n}\Big)$ lies in $[0, 1]$, we have $\lVert u_k(n) \rVert \leq \frac{\lVert X \rVert^k}{k!}$ for all $n$ and $k$. We already know the scalar exponential series $\exp(\lVert X \rVert) = \sum_{k=0}^{\infty} \frac{\lVert X \rVert^k}{k!}$ converges. Then, according to the matrix power series convergence property we showed in the prerequisites, the matrix series $\sum_{k=0}^{\infty} \frac{X^k}{k!}$ also converges, and by Tannery's theorem (dominated convergence for series) the limit and the summation can be interchanged. Therefore,
$$
\begin{align}
\lim_{n \rightarrow \infty}\Big( I + \frac{X}{n} \Big)^n
&= \lim_{n \rightarrow \infty} \sum_{k=0}^{n} \frac{X^k}{k!} \Big(1 - \frac{1}{n}\Big) \Big(1 - \frac{2}{n}\Big) \cdots \Big(1 - \frac{k - 1}{n}\Big) \\
&= \lim_{n \rightarrow \infty} \sum_{k=0}^{\infty} u_k(n) \\
&= \sum_{k=0}^{\infty} \lim_{n \rightarrow \infty} u_k(n) \\
&= \sum_{k=0}^{\infty} \frac{X^k}{k!} \\
&= \sum_{n=0}^{\infty} \frac{X^n}{n!} \\
\end{align}
$$
This concludes the proof.