Tensor Rank VS Matrix Rank
Introduction
In deep learning software documentations, such as the ONNX operator documentation, we could often see the word “rank”, “tensor rank”, or “the rank of tensor”.
In this blog post, I would like to discuss what tensor rank is and what the connections are between tensor rank and matrix rank.
Prerequisites
Outer Product
Given two vectors $u = \{u_0, u_1, \cdots, u_{m-1}\}$ and $v = \{v_0, v_1, \cdots, v_{n-1}\}$, the outer product of $u$ and $v$, denoted as $u \otimes v$, is defined as
$$
\begin{align}
u \otimes v &=
\begin{bmatrix}
u_0 v_0 & u_0 v_1 & \cdots & u_0 v_{n-1}\\
u_1 v_0 & u_1 v_1 & \cdots & u_1 v_{n-1}\\
\vdots & \vdots & \ddots & \vdots \\
u_{m-1} v_0 & u_{m-1} v_1 & \cdots & u_{m-1} v_{n-1}\\
\end{bmatrix} \nonumber\\
\end{align}
$$
Using index notation,
$$
(u \otimes v)_{i,j} = u_i v_j
$$
If $u$ and $v$ are column vectors, i.e., $u = [u_0, u_1, \cdots, u_{m-1}]^{\top}$ and $v = [v_0, v_1, \cdots, v_{n-1}]^{\top}$, the outer product could also be simply expressed using matrix multiplication.
$$
\begin{align}
u \otimes v &= u v^{\dagger}
\end{align}
$$
The mapping of outer product could be described as
$$
\otimes: \mathbb{C}^{m} \times \mathbb{C}^{n} \rightarrow \mathbb{C}^{m \times n}
$$
Tensor Product
Tensor product is essentially an general case of one dimensional array outer product. It is the outer product of two tensors, namely multidimensional tensors, that could have different dimensionality.
If $A$ is a $k$ dimensional tensor, $A \in \mathbb{C}^{ \prod_{i=0}^{k-1} m_i}$ and $B$ is a $k^{\prime}$ dimensional tensor, $B \in \mathbb{C}^{ \prod_{i=0}^{k^{\prime}-1} n_i}$, the tensor product $A \otimes B$ is a $k + k^{\prime}$ dimensional tensor that would have shape $A \otimes B \in \mathbb{C}^{ \prod_{i=0}^{k-1} m_i \prod_{i=0}^{k^{\prime}-1} n_i}$.
Matrix Rank
In linear algebra, the rank of a matrix $A$ is the dimension of the vector space generated (or spanned) by its columns (or rows). This corresponds to the maximal number of linearly independent columns (or rows) of $A$, or the minimum number of column (or row) vectors needed to span the column (or row) space. The rank of a matrix could usually be computed using Gaussian elimination. Given a matrix $A \in \mathbb{C}^{m \times n}$, the rank of the matrix $\text{rank}(A) \in [0, \min(m, n)]$. This is something that we have learned from the college linear algebra.
Using outer product notations, a matrix has rank one if it can be written as an outer product of two non-zero vectors $u$ and $v$.
$$
A = u \otimes v
$$
The rank of an arbitrary matrix $A$ is the smallest number of such outer products that can be summed to produce it.
$$
A = u_1 \otimes v_1 + u_2 \otimes v_2 + \cdots + u_n \otimes v_n
$$
Tensor Rank
Tensor rank actually has two meanings. One is actually the number of dimensions of the its array representation, and the other one is the natural extension from the matrix rank described above.
Because both meanings have exactly the same name, the user has to distinguish the meanings in different contexts.
Number of Dimensions
The total number of indices required to identify each component uniquely is equal to the dimension of the array, and is called the order, degree or rank of the tensor. This is the definition of tensor rank that are commonly seen in deep learning software. It is sometimes too simple for people to believe it is just the definition. For example, a tensor $X^{N \times C \times H \times W}$ has a rank, degree, or order of 4.
Tensor Rank Extended from Matrix Rank
Similar to a matrix of rank one, a order $k$ tensor of rank one, is a tensor that can be written as an outer product of
$$
A = u_1 \otimes u_2 \otimes \cdots \otimes u_k
$$
The rank of an arbitrary order $k$ tensor $A$ is the smallest number of such outer products that can be summed to produce it.
$$
A = u_{1,1} \otimes u_{1,2} \otimes \cdots \otimes u_{1,k} + u_{2,1} \otimes u_{2,2} \otimes \cdots \otimes u_{2,k} + \cdots + u_{n,1} \otimes u_{n,2} \otimes \cdots \otimes u_{n,k}
$$
Conclusions
Usually, tensor rank just means the number of dimensions. However, if the context is heavily mathematics related, the reader would have to watch out to see if it just means the other thing.
References
Tensor Rank VS Matrix Rank