Interpolation

Introduction

Interpolation is a method of constructing new data points within the range of a discrete set of known data points. It’s widely used in signal processing and computer vision. For example, if we enlarge an image, the new pixel values are usually interpolated from the existing nearby pixel values, and this is called two-dimensional interpolation.

In fact, interpolation can be used in any dimensional space. In this article, I would like to discuss the mathematics of common interpolation methods, including nearest neighbor interpolation, linear interpolation, and cubic interpolation, and how to do interpolation in any dimensional space.

One-Dimensional Interpolation

Any $N$-dimensional interpolation is an extension of one-dimensional interpolation. We will first go over some of the most widely used one-dimensional interpolation methods, including nearest neighbor interpolation, linear interpolation, and cubic interpolation.

Given a one-dimensional array $D \in \mathbb{R}^{d}$ and an inquiry index $x \in \mathbb{R}$, which usually lies in the range $[0, d-1]$, one-dimensional interpolation computes a value corresponding to the inquiry index $x$ such that, when the interpolated value is added to the original one-dimensional array $D$, the resulting array remains smooth.

Nearest Neighbor Interpolation

Given a floating point inquiry index $x$, its nearest neighbor interpolated value is the value corresponding to its closest integer index, which can be computed using $\text{round}(x)$. Because the values corresponding to the two integer indices around $x$ can be extremely different, nearest neighbor interpolation is extremely sensitive to the rounding mode being used, especially when the floating point index happens to fall exactly in the middle of two adjacent integer indices.


It should be noted that the nearest neighbor interpolated value $y$ can also be expressed as a dot product.

$$
\begin{align}
y &=
W(x)^{\top}
\begin{bmatrix}
f_{0} \\
f_{1} \\
\end{bmatrix} \\
&=
\begin{bmatrix}
w_1 & w_2 \\
\end{bmatrix}
\begin{bmatrix}
f(\lfloor x \rfloor) \\
f(\lfloor x \rfloor + 1) \\
\end{bmatrix} \\
&= w_1 f(\lfloor x \rfloor) + w_2 f(\lfloor x \rfloor + 1)
\end{align}
$$

where the weights $w_1$ and $w_2$ are functions of the inquiry index $x$; one of them is $1$ and the other is $0$, depending on the rounding mode.

For example, if the rounding mode is “round half down”, the weights can be computed using the following formula.

$$
\begin{align}
W(x)
&=
\begin{bmatrix}
w_1 \\
w_2 \\
\end{bmatrix} \\
&=
\begin{bmatrix}
(x - \lfloor x \rfloor) \leq 0.5 \\
(x - \lfloor x \rfloor) > 0.5 \\
\end{bmatrix} \\
\end{align}
$$

Sometimes we define $t = x - \lfloor x \rfloor$; the weights can then be computed using the following equivalent formula instead.

$$
\begin{align}
W(t)
&=
\begin{bmatrix}
w_1 \\
w_2 \\
\end{bmatrix} \\
&=
\begin{bmatrix}
t \leq 0.5 \\
t > 0.5 \\
\end{bmatrix} \\
\end{align}
$$
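To make the weight formulation concrete, here is a minimal NumPy sketch (the function name `nearest_neighbor_interpolate` is my own, not from any library) that computes the nearest neighbor interpolated value as the dot product above, using the “round half down” convention.

```python
import numpy as np

def nearest_neighbor_interpolate(d: np.ndarray, x: float) -> float:
    # Nearest neighbor interpolation on a 1D array using "round half down":
    # the fractional part t decides whether we pick floor(x) or floor(x) + 1.
    x0 = int(np.floor(x))
    t = x - x0
    # Weight vector W(t) = [t <= 0.5, t > 0.5].
    w = np.array([t <= 0.5, t > 0.5], dtype=d.dtype)
    # Clamp the upper index so that x = d - 1 does not read out of bounds.
    f = np.array([d[x0], d[min(x0 + 1, len(d) - 1)]])
    return float(w @ f)

d = np.array([1.0, 3.0, 2.0, 5.0])
print(nearest_neighbor_interpolate(d, 1.5))  # 3.0 (rounds half down to index 1)
print(nearest_neighbor_interpolate(d, 1.6))  # 2.0 (rounds to index 2)
```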

Linear Interpolation

Given a floating point inquiry index $x$, its linearly interpolated value is a linear function of $x$, namely a weighted sum of the values corresponding to its two closest integer indices $\lfloor x \rfloor$ and $\lfloor x \rfloor + 1$. Similar to the nearest neighbor weighted sum formula, the linearly interpolated value $y$ can be expressed as a dot product.

$$
\begin{align}
y &=
W(x)^{\top}
\begin{bmatrix}
f_{0} \\
f_{1} \\
\end{bmatrix} \\
&=
\begin{bmatrix}
w_1 & w_2 \\
\end{bmatrix}
\begin{bmatrix}
f(\lfloor x \rfloor) \\
f(\lfloor x \rfloor + 1) \\
\end{bmatrix} \\
&= w_1 f(\lfloor x \rfloor) + w_2 f(\lfloor x \rfloor + 1)
\end{align}
$$

where

$$
\begin{align}
W(x)
&=
\begin{bmatrix}
w_1 \\
w_2 \\
\end{bmatrix} \\
&=
\begin{bmatrix}
1 - (x - \lfloor x \rfloor) \\
x - \lfloor x \rfloor \\
\end{bmatrix} \\
\end{align}
$$

If we define $t = x - \lfloor x \rfloor$, the weights can be computed using the following equivalent formula instead.

$$
\begin{align}
W(t)
&=
\begin{bmatrix}
w_1 \\
w_2 \\
\end{bmatrix} \\
&=
\begin{bmatrix}
1 - t \\
t \\
\end{bmatrix} \\
\end{align}
$$
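Similarly, here is a minimal NumPy sketch of linear interpolation as the dot product above (again, the helper name is my own); `np.interp` is used only to cross-check the result.

```python
import numpy as np

def linear_interpolate(d: np.ndarray, x: float) -> float:
    # Linear interpolation on a 1D array expressed as the dot product
    # W(t)^T [f(floor(x)), f(floor(x) + 1)] with W(t) = [1 - t, t].
    x0 = int(np.floor(x))
    t = x - x0
    w = np.array([1.0 - t, t])
    # Clamp the upper index at the border.
    f = np.array([d[x0], d[min(x0 + 1, len(d) - 1)]])
    return float(w @ f)

d = np.array([1.0, 3.0, 2.0, 5.0])
print(linear_interpolate(d, 1.25))                # 2.75 = 0.75 * 3.0 + 0.25 * 2.0
print(np.interp(1.25, np.arange(len(d)), d))      # 2.75, agrees with np.interp
```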

In fact, the weights $w_1$ and $w_2$ can be derived theoretically. Suppose $(\lfloor x \rfloor, f_0)$ and $(\lfloor x \rfloor + 1, f_1)$ form a line; the line can be expressed as $y = kx + b$ where

$$
\begin{align}
k &= \frac{f_1 - f_0}{(\lfloor x \rfloor + 1) - \lfloor x \rfloor} \\
&= f_1 - f_0 \\
\end{align}
$$

$$
\begin{align}
b = f_0 - (f_1 - f_0) \lfloor x \rfloor
\end{align}
$$

Therefore,

$$
\begin{align}
y &= (f_1 - f_0) x + f_0 - (f_1 - f_0) \lfloor x \rfloor \\
&= (f_1 - f_0) \left( x - \lfloor x \rfloor \right) + f_0 \\
&= \left(1 - \left(x - \lfloor x \rfloor \right)\right) f_0 + (x - \lfloor x \rfloor) f_1 \\
&= w_1 f(\lfloor x \rfloor) + w_2 f(\lfloor x \rfloor + 1)
\end{align}
$$

Cubic Interpolation

Given a floating point inquiry index $x$, its cubic interpolated value is a cubic function of $x$. Recall that a cubic function has the following form

$$
y = a x^3 + b x^2 + cx + d
$$

where $y$ is only defined for $x$ between $\lfloor x \rfloor$ and $\lfloor x \rfloor + 1$.

To make derivation simpler, we define $t = x - \lfloor x \rfloor$,

$$
y = a t^3 + b t^2 + ct + d
$$

where $y$ is only defined for $t$ between $0$ and $1$.

This function contains four parameters, but the function is only constrained to go through $(0, f_0)$ and $(1, f_1)$, as $y$ is only defined for $t$ between $0$ and $1$. We need two additional constraints to determine the four parameters. The additional constraints are the derivatives of the function at $0$ and $1$.

$$
\frac{dy}{dt} = 3a t^2 + 2bt + c
$$

Concretely,

$$
\begin{align}
f_0 &= d \\
f_1 &= a + b + c + d \\
f_0^{\prime} &= c \\
f_1^{\prime} &= 3a + 2b + c \\
\end{align}
$$

But wait, we don’t know $f_0^{\prime}$ and $f_1^{\prime}$ yet. In practice, $f_0^{\prime}$ and $f_1^{\prime}$ are approximated using the following central difference formulas.

$$
\begin{align}
f_0^{\prime} &= \frac{f_1 - f_{-1}}{2} \\
f_1^{\prime} &= \frac{f_2 - f_{0}}{2} \\
\end{align}
$$

This means that we need four points, $(-1, f_{-1})$, $(0, f_0)$, $(1, f_1)$, and $(2, f_2)$, to determine the cubic function, but the cubic function is only required to go through the two points $(0, f_0)$ and $(1, f_1)$, not all four points.

We can solve the four equations and get the values of the parameters $a$, $b$, $c$, and $d$. Specifically,

$$
\begin{align}
\begin{bmatrix}
f_{0} \\
f_{1} \\
f_{0}^{\prime} \\
f_{1}^{\prime} \\
\end{bmatrix}
&=
\begin{bmatrix}
t_{0}^0 & t_{0}^1 & t_{0}^2 & t_{0}^3 \\
t_{1}^0 & t_{1}^1 & t_{1}^2 & t_{1}^3 \\
0 & t_{0}^0 & 2t_{0}^1 & 3t_{0}^2 \\
0 & t_{1}^0 & 2t_{1}^1 & 3t_{1}^2 \\
\end{bmatrix}
\begin{bmatrix}
d \\
c \\
b \\
a \\
\end{bmatrix} \\
&=
\begin{bmatrix}
1 & 0 & 0 & 0 \\
1 & 1 & 1 & 1 \\
0 & 1 & 0 & 0 \\
0 & 1 & 2 & 3 \\
\end{bmatrix}
\begin{bmatrix}
d \\
c \\
b \\
a \\
\end{bmatrix} \\
\end{align}
$$

$$
\begin{align}
\begin{bmatrix}
f_{0} \\
f_{1} \\
f_{0}^{\prime} \\
f_{1}^{\prime} \\
\end{bmatrix}
&=
\begin{bmatrix}
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
-\frac{1}{2} & 0 & \frac{1}{2} & 0 \\
0 & -\frac{1}{2} & 0 & \frac{1}{2} \\
\end{bmatrix}
\begin{bmatrix}
f_{-1} \\
f_{0} \\
f_{1} \\
f_{2} \\
\end{bmatrix}
\end{align}
$$

Thus,

$$
\begin{align}
\begin{bmatrix}
d \\
c \\
b \\
a \\
\end{bmatrix}
&=
\begin{bmatrix}
1 & 0 & 0 & 0 \\
1 & 1 & 1 & 1 \\
0 & 1 & 0 & 0 \\
0 & 1 & 2 & 3 \\
\end{bmatrix}^{-1}
\begin{bmatrix}
f_{0} \\
f_{1} \\
f_{0}^{\prime} \\
f_{1}^{\prime} \\
\end{bmatrix} \\
&=
\begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 \\
-3 & 3 & -2 & -1 \\
2 & -2 & 1 & 1 \\
\end{bmatrix}
\begin{bmatrix}
f_{0} \\
f_{1} \\
f_{0}^{\prime} \\
f_{1}^{\prime} \\
\end{bmatrix} \\
&=
\begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 \\
-3 & 3 & -2 & -1 \\
2 & -2 & 1 & 1 \\
\end{bmatrix}
\begin{bmatrix}
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
-\frac{1}{2} & 0 & \frac{1}{2} & 0 \\
0 & -\frac{1}{2} & 0 & \frac{1}{2} \\
\end{bmatrix}
\begin{bmatrix}
f_{-1} \\
f_{0} \\
f_{1} \\
f_{2} \\
\end{bmatrix} \\
&=
\begin{bmatrix}
0 & 1 & 0 & 0 \\
-0.5 & 0 & 0.5 & 0 \\
1.0 & -2.5 & 2 & -0.5 \\
-0.5 & 1.5 & -1.5 & 0.5 \\
\end{bmatrix}
\begin{bmatrix}
f_{-1} \\
f_{0} \\
f_{1} \\
f_{2} \\
\end{bmatrix} \\
\end{align}
$$
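The matrix algebra above is easy to sanity check numerically. The following sketch inverts the constraint matrix with NumPy and multiplies it by the finite difference matrix to reproduce the final coefficient matrix.

```python
import numpy as np

# Constraints: [f0, f1, f0', f1'] = H @ [d, c, b, a].
H = np.array([[1.0, 0.0, 0.0, 0.0],
              [1.0, 1.0, 1.0, 1.0],
              [0.0, 1.0, 0.0, 0.0],
              [0.0, 1.0, 2.0, 3.0]])
# Central difference approximation: [f0, f1, f0', f1'] = F @ [f_-1, f0, f1, f2].
F = np.array([[0.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0],
              [-0.5, 0.0, 0.5, 0.0],
              [0.0, -0.5, 0.0, 0.5]])
# [d, c, b, a] = H^{-1} @ F @ [f_-1, f0, f1, f2].
M = np.linalg.inv(H) @ F
print(M)
# Expected (the coefficient matrix derived above):
# [[ 0.   1.   0.   0. ]
#  [-0.5  0.   0.5  0. ]
#  [ 1.  -2.5  2.  -0.5]
#  [-0.5  1.5 -1.5  0.5]]
```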

Therefore,

$$
\begin{align}
y
&=
\begin{bmatrix}
1 & t & t^2 & t^3 \\
\end{bmatrix}
\begin{bmatrix}
d \\
c \\
b \\
a \\
\end{bmatrix} \\
&=
\begin{bmatrix}
1 & t & t^2 & t^3 \\
\end{bmatrix}
\begin{bmatrix}
0 & 1 & 0 & 0 \\
-0.5 & 0 & 0.5 & 0 \\
1.0 & -2.5 & 2 & -0.5 \\
-0.5 & 1.5 & -1.5 & 0.5 \\
\end{bmatrix}
\begin{bmatrix}
f_{-1} \\
f_{0} \\
f_{1} \\
f_{2} \\
\end{bmatrix} \\
&=
W(t)^{\top}
\begin{bmatrix}
f_{-1} \\
f_{0} \\
f_{1} \\
f_{2} \\
\end{bmatrix} \\
\end{align}
$$

where

$$
\begin{align}
W(t) &=
\begin{bmatrix}
0 & 1 & 0 & 0 \\
-0.5 & 0 & 0.5 & 0 \\
1.0 & -2.5 & 2 & -0.5 \\
-0.5 & 1.5 & -1.5 & 0.5 \\
\end{bmatrix}^{\top}
\begin{bmatrix}
1 & t & t^2 & t^3 \\
\end{bmatrix}^{\top}
\end{align}
$$

We can see that cubic interpolation can also be expressed as a dot product, in the same fashion as nearest neighbor and linear interpolation.
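Putting it together, here is a minimal sketch (my own helper, not a library API) of one-dimensional cubic interpolation using the coefficient matrix derived above; border indices are clamped, which is one of several possible boundary-handling assumptions.

```python
import numpy as np

# Coefficient matrix derived above: [d, c, b, a] = M @ [f_-1, f0, f1, f2].
CUBIC_MATRIX = np.array([[0.0, 1.0, 0.0, 0.0],
                         [-0.5, 0.0, 0.5, 0.0],
                         [1.0, -2.5, 2.0, -0.5],
                         [-0.5, 1.5, -1.5, 0.5]])

def cubic_interpolate(d: np.ndarray, x: float) -> float:
    # Cubic interpolation as the dot product [1, t, t^2, t^3] @ M @ [f_-1, f0, f1, f2].
    x0 = int(np.floor(x))
    t = x - x0
    # Gather the four neighbors, clamping indices at the borders.
    idx = np.clip([x0 - 1, x0, x0 + 1, x0 + 2], 0, len(d) - 1)
    f = d[idx]
    return float(np.array([1.0, t, t * t, t ** 3]) @ CUBIC_MATRIX @ f)

d = np.array([1.0, 3.0, 2.0, 5.0, 4.0])
print(cubic_interpolate(d, 2.0))   # 2.0, reproduces the sample exactly at integer x
print(cubic_interpolate(d, 2.25))  # 2.5625
```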

Cubic Interpolation Convolution

There is a cubic convolution interpolation algorithm, often seen in cubic interpolation implementations, which was originally published in the paper “Cubic Convolution Interpolation for Digital Image Processing”.

The convolution filter weights were formulated as follows.

$$
\begin{align}
W(x)
&=
\begin{cases}
(a + 2) \lvert x \rvert^3 - (a + 3) \lvert x \rvert^2 + 1 & \text{for $\lvert x \rvert \leq 1$} \\
a \lvert x \rvert^3 - 5a \lvert x \rvert^2 + 8a \lvert x \rvert - 4a & \text{for $1 < \lvert x \rvert < 2$} \\
0 & \text{otherwise} \\
\end{cases} \\
\end{align}
$$
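A direct implementation of this piecewise kernel might look as follows; the default $a = -0.5$ is my choice for illustration and matches the discussion below.

```python
def cubic_convolution_kernel(x: float, a: float = -0.5) -> float:
    # Piecewise cubic convolution filter weight W(x).
    x = abs(x)
    if x <= 1.0:
        return (a + 2.0) * x ** 3 - (a + 3.0) * x ** 2 + 1.0
    if x < 2.0:
        return a * x ** 3 - 5.0 * a * x ** 2 + 8.0 * a * x - 4.0 * a
    return 0.0

print(cubic_convolution_kernel(0.0))  # 1.0, the kernel is 1 at the sample itself
print(cubic_convolution_kernel(1.0))  # 0.0, and 0 at every other integer offset
```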

For each inquiry index $x$, two samples are located on its left and two samples on its right. These points are indexed from $-1$ to $2$ as $f_{-1}$, $f_{0}$, $f_{1}$, $f_{2}$. The distance from the point indexed with $0$ to the inquiry index is denoted by $t$, i.e., $t = x - \lfloor x \rfloor$ and $t \in [0, 1)$. The convolution filter weights can then be reformulated as follows.

$$
\begin{align}
W(t)
&=
\begin{bmatrix}
a (t + 1)^3 - 5a (t + 1)^2 + 8a (t + 1) - 4a \\
(a + 2) t^3 - (a + 3) t^2 + 1 \\
(a + 2) (1-t)^3 - (a + 3) (1-t)^2 + 1 \\
a (2 - t)^3 - 5a (2 - t)^2 + 8a (2 - t) - 4a \\
\end{bmatrix} \\
&=
\begin{bmatrix}
a (t^3 + 3t^2 + 3t + 1) - 5a (t^2 + 2t + 1) + 8a (t + 1) - 4a \\
(a + 2) t^3 - (a + 3) t^2 + 1 \\
(a + 2) (1 - 3t + 3t^2 -t^3) - (a + 3) (1 - 2t + t^2) + 1 \\
a (8 - 12t + 6t^2 - t^3) - 5a (4 - 4t + t^2) + 8a (2 - t) - 4a \\
\end{bmatrix} \\
&=
\begin{bmatrix}
a t^3 - 2at^2 + at \\
(a + 2) t^3 - (a + 3) t^2 + 1 \\
- (a + 2) t^3 + (2a + 3)t^2 - at \\
-at^3 + at^2 \\
\end{bmatrix} \\
&=
\begin{bmatrix}
0 & 1 & 0 & 0 \\
a & 0 & -a & 0 \\
-2a & -(a+3) & (2a+3) & a \\
a & (a+2) & -(a+2) & -a \\
\end{bmatrix}^{\top}
\begin{bmatrix}
1 & t & t^2 & t^3 \\
\end{bmatrix}^{\top} \\
\end{align}
$$

This should look very familiar. When $a = -0.5$, the weights are exactly the same as the weights we derived for cubic interpolation. In practice, different implementations might use different values for $a$. For example, $a$ is usually set to $-0.5$ in PIL and $-0.75$ in OpenCV.

A very common implementation for computing cubic interpolation coefficients is as follows.

$$
\begin{align}
W(t)
&=
\begin{bmatrix}
a (t + 1)^3 - 5a (t + 1)^2 + 8a (t + 1) - 4a \\
(a + 2) t^3 - (a + 3) t^2 + 1 \\
(a + 2) (1-t)^3 - (a + 3) (1-t)^2 + 1 \\
a (2 - t)^3 - 5a (2 - t)^2 + 8a (2 - t) - 4a \\
\end{bmatrix} \\
&=
\begin{bmatrix}
\left( \left( a (t + 1) - 5a \right) \left(t + 1 \right) + 8a \right) \left(t + 1\right) - 4a \\
\left( \left(a + 2\right) t - \left(a + 3\right)\right) t^2 + 1 \\
\left(\left(a + 2\right) \left(1-t\right) - \left(a + 3\right)\right) (1-t)^2 + 1 \\
\left(\left( a \left(2 - t\right) - 5a \right) \left(2 - t\right) + 8a\right) \left(2 - t\right) - 4a \\
\end{bmatrix} \\
\end{align}
$$

For example, the ONNX resize cubic interpolation implementation uses this particular formula.

The interpolated value is therefore the dot product between the kernel parameters and the adjacent values.

$$
\begin{align}
y
&= W(t)^{\top}
\begin{bmatrix}
f_{-1} \\
f_{0} \\
f_{1} \\
f_{2} \\
\end{bmatrix} \\
&=
\begin{bmatrix}
1 & t & t^2 & t^3 \\
\end{bmatrix}
\begin{bmatrix}
0 & 1 & 0 & 0 \\
a & 0 & -a & 0 \\
-2a & -(a+3) & (2a+3) & a \\
a & (a+2) & -(a+2) & -a \\
\end{bmatrix}
\begin{bmatrix}
f_{-1} \\
f_{0} \\
f_{1} \\
f_{2} \\
\end{bmatrix} \\
\end{align}
$$
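As a sketch, the matrix form above can be evaluated directly. The default $a = -0.75$ below is OpenCV’s usual choice, and $a = -0.5$ reproduces the cubic interpolation weights derived earlier.

```python
import numpy as np

def cubic_convolution_weights(t: float, a: float = -0.75) -> np.ndarray:
    # W(t) for the four neighbors [f_-1, f0, f1, f2], in matrix form.
    m = np.array([[0.0, 1.0, 0.0, 0.0],
                  [a, 0.0, -a, 0.0],
                  [-2.0 * a, -(a + 3.0), 2.0 * a + 3.0, a],
                  [a, a + 2.0, -(a + 2.0), -a]])
    return m.T @ np.array([1.0, t, t * t, t ** 3])

t = 0.25
f = np.array([3.0, 2.0, 5.0, 4.0])
# With a = -0.5 the weights match the cubic interpolation derived earlier.
y = cubic_convolution_weights(t, a=-0.5) @ f
print(y)  # 2.5625, the same value as the cubic interpolation example above
```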

Cubic Interpolation VS Cubic Spline Interpolation

In fact, strictly speaking, the cubic interpolation we introduced above is cubic spline interpolation. In computer vision, it is used so much more often than vanilla cubic interpolation that it has even taken over the name.

Unlike cubic spline interpolation, vanilla cubic interpolation enforces that the cubic function goes through all four points, $(-1, f_{-1})$, $(0, f_{0})$, $(1, f_{1})$, and $(2, f_{2})$.

Given $t = x - \lfloor x \rfloor$, the cubic function has the following form, which is the same as the one in cubic spline interpolation.

$$
y = a t^3 + b t^2 + ct + d
$$

The four equations to solve are the following instead.

$$
\begin{align}
\begin{bmatrix}
f_{-1} \\
f_{0} \\
f_{1} \\
f_{2} \\
\end{bmatrix}
&=
\begin{bmatrix}
t_{-1}^0 & t_{-1}^1 & t_{-1}^2 & t_{-1}^3 \\
t_{0}^0 & t_{0}^1 & t_{0}^2 & t_{0}^3 \\
t_{1}^0 & t_{1}^1 & t_{1}^2 & t_{1}^3 \\
t_{2}^0 & t_{2}^1 & t_{2}^2 & t_{2}^3 \\
\end{bmatrix}
\begin{bmatrix}
d \\
c \\
b \\
a \\
\end{bmatrix} \\
&=
\begin{bmatrix}
1 & -1 & 1 & -1 \\
1 & 0 & 0 & 0 \\
1 & 1 & 1 & 1 \\
1 & 2 & 4 & 8 \\
\end{bmatrix}
\begin{bmatrix}
d \\
c \\
b \\
a \\
\end{bmatrix} \\
\end{align}
$$

Thus,

$$
\begin{align}
\begin{bmatrix}
d \\
c \\
b \\
a \\
\end{bmatrix}
&=
\begin{bmatrix}
1 & -1 & 1 & -1 \\
1 & 0 & 0 & 0 \\
1 & 1 & 1 & 1 \\
1 & 2 & 4 & 8 \\
\end{bmatrix}^{-1}
\begin{bmatrix}
f_{-1} \\
f_{0} \\
f_{1} \\
f_{2} \\
\end{bmatrix} \\
&=
\begin{bmatrix}
0 & 1 & 0 & 0 \\
-\frac{1}{3} & -\frac{1}{2} & 1 & -\frac{1}{6} \\
\frac{1}{2} & -1 & \frac{1}{2} & 0 \\
-\frac{1}{6} & \frac{1}{2} & -\frac{1}{2} & \frac{1}{6} \\
\end{bmatrix}
\begin{bmatrix}
f_{-1} \\
f_{0} \\
f_{1} \\
f_{2} \\
\end{bmatrix} \\
\end{align}
$$

Therefore,

$$
\begin{align}
y
&=
\begin{bmatrix}
1 & t & t^2 & t^3 \\
\end{bmatrix}
\begin{bmatrix}
d \\
c \\
b \\
a \\
\end{bmatrix} \\
&=
\begin{bmatrix}
1 & t & t^2 & t^3 \\
\end{bmatrix}
\begin{bmatrix}
0 & 1 & 0 & 0 \\
-\frac{1}{3} & -\frac{1}{2} & 1 & -\frac{1}{6} \\
\frac{1}{2} & -1 & \frac{1}{2} & 0 \\
-\frac{1}{6} & \frac{1}{2} & -\frac{1}{2} & \frac{1}{6} \\
\end{bmatrix}
\begin{bmatrix}
f_{-1} \\
f_{0} \\
f_{1} \\
f_{2} \\
\end{bmatrix} \\
&= W(t)^{\top}
\begin{bmatrix}
f_{-1} \\
f_{0} \\
f_{1} \\
f_{2} \\
\end{bmatrix} \\
\end{align}
$$

where

$$
\begin{align}
W(t) &=
\begin{bmatrix}
0 & 1 & 0 & 0 \\
-\frac{1}{3} & -\frac{1}{2} & 1 & -\frac{1}{6} \\
\frac{1}{2} & -1 & \frac{1}{2} & 0 \\
-\frac{1}{6} & \frac{1}{2} & -\frac{1}{2} & \frac{1}{6} \\
\end{bmatrix}^{\top}
\begin{bmatrix}
1 & t & t^2 & t^3 \\
\end{bmatrix}^{\top}
\end{align}
$$
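The same dot product structure can be evaluated directly. Here is a minimal sketch using the coefficient matrix above (the helper name is my own):

```python
import numpy as np

# Coefficient matrix for the cubic passing through all four points
# (-1, f_-1), (0, f0), (1, f1), (2, f2): [d, c, b, a] = M @ [f_-1, f0, f1, f2].
VANILLA_CUBIC_MATRIX = np.array([
    [0.0, 1.0, 0.0, 0.0],
    [-1.0 / 3.0, -0.5, 1.0, -1.0 / 6.0],
    [0.5, -1.0, 0.5, 0.0],
    [-1.0 / 6.0, 0.5, -0.5, 1.0 / 6.0],
])

def vanilla_cubic_interpolate(f: np.ndarray, t: float) -> float:
    # f holds [f_-1, f0, f1, f2]; t is the fractional offset in [0, 1).
    return float(np.array([1.0, t, t * t, t ** 3]) @ VANILLA_CUBIC_MATRIX @ f)

f = np.array([3.0, 2.0, 5.0, 4.0])
# The interpolating cubic passes through all four samples exactly.
print(vanilla_cubic_interpolate(f, 0.0))  # 2.0 (= f0)
print(vanilla_cubic_interpolate(f, 1.0))  # 5.0 (= f1)
```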

$N$-Dimensional Interpolation

Given an $N$-dimensional tensor $D \in \mathbb{R}^{d_1 \times d_2 \times \cdots \times d_N}$ and an inquiry index $x \in \mathbb{R}^{N}$, $N$-dimensional interpolation computes a value corresponding to the inquiry index $x$ such that, when the interpolated value is added to the original $N$-dimensional tensor $D$, the resulting tensor remains smooth.

For example, the following figure shows the comparison between one-dimensional interpolation and two-dimensional interpolation.

1D Interpolation VS 2D Interpolation

Notice from the figure above that to do one-dimensional interpolation for a given inquiry index, we need $2$ adjacent points for nearest neighbor and linear interpolation and $4$ adjacent points for cubic interpolation, whereas to do two-dimensional interpolation for a given inquiry index, we need $2^2 = 4$ adjacent points for nearest neighbor and linear interpolation and $4^2 = 16$ adjacent points for cubic interpolation.

From $N$-Dimensional Interpolation to One-Dimensional Interpolation

In fact, the $N$-dimensional interpolation can be decomposed into a sequence of one-dimensional interpolations. For example, suppose we have a 3D tensor $D \in \mathbb{R}^{2 \times 3 \times 4}$ and the inquiry index $x = (1.3, 2.4, 0.7)$. We can start the one-dimensional interpolations from the last dimension. The inquiry index for the last dimension is $0.7$; after the one-dimensional interpolation, the resulting interpolated tensor is a 2D tensor $D^{\prime} \in \mathbb{R}^{2 \times 3}$. We then run one-dimensional interpolations along the (new) last dimension with the inquiry index $2.4$, and the resulting tensor is a 1D tensor $D^{\prime\prime} \in \mathbb{R}^{2}$. Finally, we run the last one-dimensional interpolation with the inquiry index $1.3$, and the final interpolated value is a scalar $y \in \mathbb{R}$.
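Here is a minimal sketch of this dimension-by-dimension reduction using one-dimensional linear interpolation (the helper names are my own; out-of-range indices are clamped here, which is an assumption rather than part of the definition):

```python
import numpy as np

def linear_interpolate_last_axis(d: np.ndarray, x: float) -> np.ndarray:
    # One-dimensional linear interpolation applied along the last axis,
    # reducing the tensor rank by one. Indices are clamped at the border.
    x0 = int(np.floor(x))
    x1 = min(x0 + 1, d.shape[-1] - 1)
    t = x - x0
    return (1.0 - t) * d[..., x0] + t * d[..., x1]

def nd_linear_interpolate(d: np.ndarray, x: tuple) -> float:
    # Interpolate the last dimension first, then the next, and so on,
    # until a scalar remains.
    for xi in reversed(x):
        d = linear_interpolate_last_axis(d, xi)
    return float(d)

d = np.arange(2 * 3 * 4, dtype=np.float64).reshape(2, 3, 4)
print(nd_linear_interpolate(d, (1.3, 2.4, 0.7)))  # approximately 20.7
```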

There is an interesting phenomenon. Given a one-dimensional interpolation method that satisfies some constraints, when it is generalized to $N$-dimensional interpolation, the order in which the one-dimensional interpolations are applied to the dimensions does not matter.

We will still use the 3D tensor $D \in \mathbb{R}^{2 \times 3 \times 4}$ and the inquiry index $x = (1.3, 2.4, 0.7)$ as an example. We can interpolate from right to left, the resulting interpolated tensor shape is $\mathbb{R}^{2 \times 3 \times 4} \rightarrow \mathbb{R}^{2 \times 3} \rightarrow \mathbb{R}^{2} \rightarrow \mathbb{R}$. We can also interpolate from left to right, $\mathbb{R}^{2 \times 3 \times 4} \rightarrow \mathbb{R}^{3 \times 4} \rightarrow \mathbb{R}^{4} \rightarrow \mathbb{R}$. We can also interpolate the middle dimension first, followed by the left dimension and the right dimension, $\mathbb{R}^{2 \times 3 \times 4} \rightarrow \mathbb{R}^{2 \times 4} \rightarrow \mathbb{R}^{4} \rightarrow \mathbb{R}$.

We will show a mathematical proof that the interpolation order does not matter if the one-dimensional interpolation can satisfy some constraints.

Proof

The first step is to show that the interpolation order does not matter for 2D interpolation, possibly under some constraints. That is to say, no matter whether we interpolate along the height dimension first or the width dimension first, the interpolated values are always the same.

Suppose the inquiry index is $x = (x_1, x_2)$. The values used for interpolation always form a $(K, K)$ submatrix of the original matrix. For nearest neighbor and linear interpolation, $K = 2$. For cubic interpolation, $K = 4$.

$$
\begin{align}
A &=
\begin{bmatrix}
A_{1,1} & A_{1,2} & \cdots & A_{1,K} \\
A_{2,1} & A_{2,2} & \cdots & A_{2,K} \\
\vdots & \vdots & \ddots & \vdots \\
A_{K,1} & A_{K,2} & \cdots & A_{K,K} \\
\end{bmatrix} \\
\end{align}
$$

Suppose the 1D interpolation function is $f$.

The interpolated value for interpolating the width first followed by interpolating the height is

$$
y_1 =
f(
f(A_{1,1}, A_{1,2}, \cdots, A_{1,K}, x_2),
f(A_{2,1}, A_{2,2}, \cdots, A_{2,K}, x_2),
\cdots,
f(A_{K,1}, A_{K,2}, \cdots, A_{K,K}, x_2),
x_1
)
$$

The interpolated value for interpolating the height first followed by interpolating the width is

$$
y_2 =
f(
f(A_{1,1}, A_{2,1}, \cdots, A_{K,1}, x_1),
f(A_{1,2}, A_{2,2}, \cdots, A_{K,2}, x_1),
\cdots,
f(A_{1,K}, A_{2,K}, \cdots, A_{K,K}, x_1),
x_2
)
$$

If the one-dimensional interpolation order does not matter, we will have $y_1 = y_2$. Now the question is, when will $y_1 = y_2$?

One possible scenario is when the function $f$ is a dot product with the following structure.

$$
\begin{align}
f(a_1, a_2, \cdots, a_K, x) &= a_1 g_1(x) + a_2 g_2(x) + \cdots + a_K g_K(x) \\
&= \sum_{i=1}^{K} a_i g_i(x)
\end{align}
$$

It’s not difficult to verify $y_1 = y_2$ in this scenario.

$$
\begin{align}
y_1
&=
f\left(
\sum_{i=1}^{K} A_{1,i} g_i(x_2),
\sum_{i=1}^{K} A_{2,i} g_i(x_2),
\cdots,
\sum_{i=1}^{K} A_{K,i} g_i(x_2),
x_1
\right) \\
&=
\left( \sum_{i=1}^{K} A_{1,i} g_i(x_2) \right) g_1(x_1) + \left( \sum_{i=1}^{K} A_{2,i} g_i(x_2) \right) g_2(x_1) + \cdots + \left( \sum_{i=1}^{K} A_{K,i} g_i(x_2) \right) g_K(x_1) \\
&=
\left( \sum_{i=1}^{K} A_{1,i} g_1(x_1) g_i(x_2) \right) + \left( \sum_{i=1}^{K} A_{2,i} g_2(x_1) g_i(x_2) \right) + \cdots + \left( \sum_{i=1}^{K} A_{K,i} g_K(x_1) g_i(x_2) \right) \\
&= \left( \sum_{i=1}^{K} A_{i,1} g_i(x_1) \right) g_1(x_2) + \left( \sum_{i=1}^{K} A_{i,2} g_i(x_1) \right) g_2(x_2) + \cdots + \left( \sum_{i=1}^{K} A_{i,K} g_i(x_1) \right) g_K(x_2) \\
\end{align}
$$

$$
\begin{align}
y_2
&=
f\left(
\sum_{i=1}^{K} A_{i,1} g_i(x_1),
\sum_{i=1}^{K} A_{i,2} g_i(x_1),
\cdots,
\sum_{i=1}^{K} A_{i,K} g_i(x_1),
x_2
\right) \\
&= \left( \sum_{i=1}^{K} A_{i,1} g_i(x_1) \right) g_1(x_2) + \left( \sum_{i=1}^{K} A_{i,2} g_i(x_1) \right) g_2(x_2) + \cdots + \left( \sum_{i=1}^{K} A_{i,K} g_i(x_1) \right) g_K(x_2) \\
\end{align}
$$

The nearest neighbor, linear, and cubic interpolation methods we described previously can all be expressed using the above dot product structure, indicating that the interpolation dimension order does not matter for these methods in 2D interpolation.
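As a quick numerical check of this claim for 2D linear interpolation, the following sketch compares the width-first and height-first orders:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((2, 2))  # the K x K submatrix around the inquiry index, K = 2
t1, t2 = 0.3, 0.8       # fractional offsets along the height and width dimensions

def lerp(f0, f1, t):
    # 1D linear interpolation: the dot product structure f0 * g1(t) + f1 * g2(t).
    return (1.0 - t) * f0 + t * f1

# Interpolate the width first, then the height.
y1 = lerp(lerp(A[0, 0], A[0, 1], t2), lerp(A[1, 0], A[1, 1], t2), t1)
# Interpolate the height first, then the width.
y2 = lerp(lerp(A[0, 0], A[1, 0], t1), lerp(A[0, 1], A[1, 1], t1), t2)
print(np.isclose(y1, y2))  # True, the order does not matter
```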

Next, we will show that for an $N$-dimensional tensor, using interpolation methods that satisfy the constraints above, the interpolated values are the same for any two interpolation orders.

This is easy to show. Given any two sequences, one sequence can always be converted to the other by swapping two adjacent values a finite number of times. For example, suppose the two sequences are $\{1, 3, 2, 4\}$ and $\{4, 1, 3, 2\}$, which represent two interpolation orders over the 4 dimensions of a 4D tensor. We can turn $\{4, 1, 3, 2\}$ into $\{1, 3, 2, 4\}$ via, for example, the following swaps.

$$
\{4, 1, 3, 2\} \rightarrow \{1, 4, 3, 2\} \rightarrow \{1, 3, 4, 2\} \rightarrow \{1, 3, 2, 4\}
$$

Because we have shown that the interpolation order does not matter for 2D interpolation under the constraints above, swapping two adjacent one-dimensional interpolations does not change the result. Therefore, the interpolated values for sequences 1 and 2 must be the same, the interpolated values for sequences 2 and 3 must be the same, and the interpolated values for sequences 3 and 4 must be the same. We can then conclude that the interpolated values for sequences 1 and 4 must be the same.

This concludes the proof. $\square$

Corner Alignments

Corner Alignment Modes

The “align corners” term is very commonly seen in image interpolation. Suppose our image consists of only one row of $n$ pixels. Using 0-based indexing, the values of pixels $0, 1, \cdots, n-1$ are $f_{0}, f_{1}, \cdots, f_{n-1}$, respectively. Pixel interpolation is slightly more complicated than what we have discussed above, because each pixel has a certain width. Depending on how we assign the inquiry point $x$ to each pixel, the interpolation result can be slightly different.

There are two common ways of assigning the inquiry point $x$ for each pixel, “align left corner” and “align center”.

In the “align left corner” mode, the valid floating point inquiry range for $x$ is from $0$ to $n-1$. Any inquiry outside this range must rely on padding or some other assumption. For example, in the following figure, even though $x = 3.5$ seems to be inside the last pixel, it is outside the valid range.

1D Align Left Corner

In the “align center” mode, the valid floating point inquiry $x$ range is from $-0.5$ to $n-0.5$.

1D Align Center

The same behavior could be generalized to higher dimensional pixel spaces as well.

Coordinate Normalization

When it comes to coordinate normalization, suppose the original coordinate range is $[x_{\text{min}}, x_{\text{max}}]$ and the normalized coordinate range is $[x^{\prime}_{\text{min}}, x^{\prime}_{\text{max}}]$. Specifically, the original coordinate range is $[0, n-1]$ in the “align left corner” mode and $[-0.5, n-0.5]$ in the “align center” mode, as we mentioned above. There are commonly two normalized coordinate ranges, $[0, 1]$ and $[-1, 1]$.

The mapping between the original coordinate and the normalized coordinate is linear, $x^{\prime} = kx + b$. The coefficients can be calculated by solving the linear equations

$$
\begin{align}
x^{\prime}_{\text{max}} &= k x_{\text{max}} + b \\
x^{\prime}_{\text{min}} &= k x_{\text{min}} + b \\
\end{align}
$$

Therefore,

$$
\begin{align}
k &= \frac{x^{\prime}_{\text{max}} - x^{\prime}_{\text{min}}}{x_{\text{max}} - x_{\text{min}}} \\
b &= \frac{x_{\text{max}} x^{\prime}_{\text{min}} - x_{\text{min}} x^{\prime}_{\text{max}}}{x_{\text{max}} - x_{\text{min}}} \\
\end{align}
$$

$$
x^{\prime} = \frac{x^{\prime}_{\text{max}} - x^{\prime}_{\text{min}}}{x_{\text{max}} - x_{\text{min}}} x + \frac{x_{\text{max}} x^{\prime}_{\text{min}} - x_{\text{min}} x^{\prime}_{\text{max}}}{x_{\text{max}} - x_{\text{min}}}
$$
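Here is a small sketch of this linear mapping (the function name is my own):

```python
def normalize_coordinate(x, x_min, x_max, x_new_min, x_new_max):
    # Linear map x' = k * x + b between the original range [x_min, x_max]
    # and the normalized range [x_new_min, x_new_max].
    k = (x_new_max - x_new_min) / (x_max - x_min)
    b = (x_max * x_new_min - x_min * x_new_max) / (x_max - x_min)
    return k * x + b

n = 4
# "Align left corner": original range [0, n - 1] mapped to [-1, 1].
print(normalize_coordinate(1.5, 0.0, n - 1.0, -1.0, 1.0))   # 0.0
# "Align center": original range [-0.5, n - 0.5] mapped to [-1, 1].
print(normalize_coordinate(1.5, -0.5, n - 0.5, -1.0, 1.0))  # 0.0
```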
