Lei Mao bio photo

Lei Mao

Machine Learning, Artificial Intelligence, Computer Science.

Twitter Facebook LinkedIn GitHub   G. Scholar E-Mail RSS

Introduction

I am busy preparing a blog post for quantum Shor’s factorization algorithm. Shor’s algorithm is complicated if the reader would like to understand all of its mathematics from scratch. I have created several building-block posts toward Shor’s algorithm, including “Discrete Fourier Transform”, “Euclidean Algorithm”, and “Composite Number Factorization Using Modular Exponentiation Period”. Each of them is self-contained and is a must to fully understand Shor’s algorithm mathematically.


In this blog post, I would like to finish the last building-block post toward Shor’s algorithm which is an interesting mathematical property of the application of discrete Fourier transform on $0/1$ periodic sequences.

Discrete Fourier Transform for 0/1 Periodic Sequences

0/1 Periodic Sequences

The $0/1$ periodic sequence is a periodic sequence in which for each period all the values are $0$ except for one value is $1$. For example, $\{0,0,1,0,0,0,1,0,0,0,1,0\}$ is such kind of periodic sequence whose unique subsequence is $\{0,0,1,0\}$. $\{0,0,1,0\}$, $\{1,0,0,0\}$, $\{0,0,0,1\}$ are also considered to be valid unique subsequences. The period of this sequence is $4$.

Discrete Fourier Transform for 0/1 Periodic Sequences

To see the effect of discrete Fourier transform on $0/1$ periodic sequences, let’s take a look at some examples first. Here I have prepared four real valued $0/1$ periodic sequences, the length of each of them is $N = 2^8 = 256$.

\[\begin{align} f_1 &= \{ \underbrace{ 0,0,1,0,0,0,1,0,\cdots,0,0,1,0}_{N = 2^8}\} \\ f_2 &= \{ \underbrace{ 0,0,1,0,0,1,\cdots,0,0,1,0}_{N = 2^8} \} \\ F_1 &= \{ \underbrace{ 0,0,1,0,0,0,1,0,\cdots,0,0,1,0}_{N = 2^8} \} \\ F_2 &= \{ \underbrace{ 0,0,1,0,0,1,\cdots,0,0,1,0}_{N = 2^8} \} \\ \end{align}\]

Actually they are two sequences since $f_1 = F_1$ and $f_2 = F_2$. I used different notations for same sequences because $f$ would be applied discrete Fourier transformation, whereas $F$ would be applied inverse discrete Fourier transformation. The unique subsequence for $f_1$ and $F_1$ is $\{0,0,1,0\}$, and the unique subsequence for $f_2$ and $F_2$ is $\{0,0,1\}$. Therefore, the period for $f_1$ and $F_1$ is $4$, and the period for $f_2$ and $F_2$ is $3$. Note that the period for $f_1$ and $F_1$ divides $N = 2^8$ whereas the period for $f_2$ and $F_2$ does not.


The sequence could also be complex-numbered. The magnitude $\rho$ of a complex number $a + b i$ is $\sqrt{a^2 + b^2}$. If the complex number is represented using polar coordinates $\rho e^{i\theta}$, the magnitude is just $\rho$ then. So the magnitude of the sequence is represented by $\rho_{f}$ or $\rho_{F}$.


We further define the “normalized probability” for a sequence. The concept of “probability” is less useful in this context, but it will become extremely useful when we discuss quantum computing and Shor’s factorization algorithm. The normalized probability for each element $f[i]$ in the sequence is defined as

\[p_f[i] = \frac{\rho_{f}[i]^2}{\sum_{j=1}^{N}\rho_{f}[j]^2}\]

or

\[p_F[i] = \frac{\rho_{F}[i]^2}{\sum_{j=1}^{N}\rho_{F}[j]^2}\]

If we apply discrete Fourier transform and inverse discrete Fourier transform to such sequence, let’s see what will happen. We will use a Python program for demonstration.

# dft_periodic_01.py
import numpy as np
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt

def plot_probabilities(f_N, F_N, title, filename):

    N = len(f_N)
    assert N == len(F_N)

    f_N_magnitude = np.absolute(f_N)
    f_N_probability = f_N_magnitude ** 2 / np.sum(f_N_magnitude ** 2)

    F_N_magnitude = np.absolute(F_N)
    F_N_probability = F_N_magnitude ** 2 / np.sum(F_N_magnitude ** 2)

    x = np.arange(N)
    y1 = f_N_probability
    y2 = F_N_probability
    fig, ax = plt.subplots(figsize=(6,4))
    ax.stem(x, y1, "b", markerfmt="bo", label="f(n)")
    ax.stem(x, y2, "g", markerfmt="go", label="F(n)")
    ax.legend()
    ax.set_xlabel("n")
    ax.set_ylabel("p")
    ax.set_title(title)
    fig.savefig(filename + ".svg", format="svg", dpi=600, bbox_inches="tight")

def dft(f_N):

    F_N = np.fft.fft(f_N)
    F_N_magnitude = np.absolute(F_N)
    F_N_probability = F_N_magnitude ** 2 / np.sum(F_N_magnitude ** 2)

    return F_N, F_N_magnitude, F_N_probability

def idft(F_N):

    f_N = np.fft.ifft(F_N)
    f_N_magnitude = np.absolute(f_N)
    f_N_probability = f_N_magnitude ** 2 / np.sum(f_N_magnitude ** 2)

    return f_N, f_N_magnitude, f_N_probability

def plot_dft_probabilities(f_N, title, filename):

    F_N = np.fft.fft(f_N)
    plot_probabilities(f_N=f_N, F_N=F_N, title=title, filename=filename)

def plot_inverse_dft_probabilities(F_N, title, filename):

    f_N = np.fft.ifft(F_N)
    plot_probabilities(f_N=f_N, F_N=F_N, title=title, filename=filename)

def generate_periodic_sequence(unique_sequence, N):

    unique_sequence = list(unique_sequence)
    return (unique_sequence * (N // len(unique_sequence) + 1))[:N]

def main():

    n = 8
    N = 2 ** n

    # Period divides sequence length
    unique_sequence_1 = [0,0,1,0]
    # Period does not divide sequence length
    unique_sequence_2 = [0,0,1]

    f1_N = generate_periodic_sequence(unique_sequence=unique_sequence_1, N=N)
    f2_N = generate_periodic_sequence(unique_sequence=unique_sequence_2, N=N)
    plot_dft_probabilities(f_N=f1_N, title="f1 DFT", filename="f1_dft")
    plot_dft_probabilities(f_N=f2_N, title="f2 DFT", filename="f2_dft")

    F1_N = generate_periodic_sequence(unique_sequence=unique_sequence_1, N=N)
    F2_N = generate_periodic_sequence(unique_sequence=unique_sequence_2, N=N)
    plot_inverse_dft_probabilities(F_N=F1_N, title="F1 iDFT", filename="F1_idft")
    plot_inverse_dft_probabilities(F_N=F2_N, title="F2 iDFT", filename="F2_idft")

if __name__ == "__main__":
    
    main()

In fact, the magnitude of the output sequences from discrete Fourier transform and inverse discrete Fourier transform are also perfect $0/1$ periodic sequences for $f_1$ and $F_1$ and almost $0/1$ periodic sequences for $f_2$ and $F_2$. Suppose the period of $0/1$ periodic sequences before discrete Fourier transform or inverse discrete Transform is $r$, then the period of the output sequences is exactly $\frac{N}{r}$ for sequences whose period $r$ divides $N$, such as $f_1$ and $F_1$, and around $\frac{N}{r}$ for sequences whose period $r$ does divide $N$, such as $f_2$ and $F_2$.

Discrete Fourier Transform

The period of $f_1$ is $4$, the period of $f_2$ is $3$, and $N = 2^8 = 256$. We found the period of the magnitude or probability of the output sequence from discrete Fourier transform is $64$ for $f_1$ and $85$ for $f_2$. Note that $64 = \frac{256}{4}$ and $85 \approx \frac{256}{3}$.

Inverse Discrete Fourier Transform

The same effect also holds for inverse discrete Fourier transform.


Therefore, formally, suppose the input $0/1$ periodic sequence has period $r$ and length $N$, after discrete Fourier transform or inverse discrete Fourier transform, the normalized probabilities of the output sequence will have period $r^{\prime}$ exactly equal to $\frac{N}{r}$ if $r$ divides $N$, or $r^{\prime}$ around $\frac{N}{r}$ if $r$ does not divide $N$. The highest normalized probabilities will occur at positions $k \frac{N}{r}$ where $k = 0, 1, \cdots, r-1$. The normalized probabilities for all the rest of the positions will be $0$ if $r$ divides $N$ or almost $0$ if $r$ does not divide $N$.

Mathematical Proof

To explain this phenomenon mathematically, I would like to derive a proof for inverse discrete Fourier transform and the scenario where $r$ divides $N$. For the scenario where $r$ does not divide $N$, I would also try to explain intuitively with mathematics. For discrete Fourier transform, it could be proved using similar mathematical processes.


Proof


The inverse Fourier transform is mathematically defined as

\[\begin{align} f[n] &= \frac{1}{N} \sum_{k = 0}^{N-1} F[k] e^{i \frac{2\pi k n}{N}} \\ \end{align}\]

We defined offset $b$ as the distance between the beginning of the unique sequence and the position of $1$ in the unique sequence. Then we must have

\[f[b] = f[b + kr]\]

where $k$ is any non-negative integer, as long as $b + kr < N$.


Assuming $r$ divides $N$, the inverse discrete Fourier transform for $0/1$ periodic sequence becomes

\[\begin{align} f[n] &= \frac{1}{N} \sum_{k = 0}^{N-1} F[k] e^{i \frac{2\pi k n}{N}} \\ &= \frac{1}{N} \bigg( \underbrace{ F[b] e^{i \frac{2\pi b n}{N}} + F[b + r] e^{i \frac{2\pi (b + r) n}{N}} + \cdots + F[b + (\frac{N}{r} - 1) r] e^{i \frac{2\pi (b + (\frac{N}{r} - 1) r) n}{N}} }_{\frac{N}{r}} \bigg) \\ &= \frac{1}{N} \big( \underbrace{ F[b] e^{i \frac{2\pi b n}{N}} + F[b] e^{i \frac{2\pi (b + r) n}{N}} + \cdots + F[b] e^{i \frac{2\pi (b + (\frac{N}{r} - 1) r) n}{N}} }_{\frac{N}{r}} \big) \\ &= \frac{1}{N} F[b] \big( \underbrace{ e^{i \frac{2\pi b n}{N}} + e^{i \frac{2\pi (b + r) n}{N}} + \cdots + e^{i \frac{2\pi (b + (\frac{N}{r} - 1) r) n}{N}} }_{\frac{N}{r}} \big) \\ &= \frac{1}{N} F[b] e^{i \frac{2\pi b n}{N}} \big( \underbrace{ 1 + e^{i \frac{2\pi r n}{N}} + \cdots + e^{i \frac{2\pi (\frac{N}{r} - 1) r n}{N}} }_{\frac{N}{r}} \big) \\ \end{align}\]

Note that $\{1, e^{i \frac{2\pi r n}{N}}, \cdots, e^{i \frac{2\pi (\frac{N}{r} - 1) r n}{N}} \}$ is a geometric sequence. We would have to compute the sum of the geometric sequence.

\[\begin{align} \underbrace{ 1 + e^{i \frac{2\pi r n}{N}} + \cdots + e^{i \frac{2\pi (\frac{N}{r} - 1) r n}{N}} }_{\frac{N}{r}} &= \begin{cases} \frac{N}{r} & \text{if $e^{i \frac{2\pi r n}{N}} = 1$}\\ \frac{1 - \big(e^{i \frac{2\pi r n}{N}}\big)^{\frac{N}{r}} }{1 - e^{i \frac{2\pi r n}{N}}} & \text{else}\\ \end{cases} \\ &= \begin{cases} \frac{N}{r} & \text{if $e^{i \frac{2\pi r n}{N}} = 1$}\\ \frac{1 - e^{i 2\pi n} }{1 - e^{i \frac{2\pi r n}{N}}} & \text{else}\\ \end{cases} \\ &= \begin{cases} \frac{N}{r} & \text{if $e^{i \frac{2\pi r n}{N}} = 1$}\\ \frac{1 - 1 }{1 - e^{i \frac{2\pi r n}{N}}} & \text{else}\\ \end{cases} \\ &= \begin{cases} \frac{N}{r} & \text{if $e^{i \frac{2\pi r n}{N}} = 1$}\\ 0 & \text{else}\\ \end{cases} \\ \end{align}\]

$e^{i \frac{2\pi r n}{N}} = 1$ if and only if

\[n = k \frac{N}{r}\]

where $k = 0, 1, \cdots, r-1$. So we have

\[\begin{align} f[n] &= \frac{1}{N} F[b] e^{i \frac{2\pi b n}{N}} \big( \underbrace{ 1 + e^{i \frac{2\pi r n}{N}} + \cdots + e^{i \frac{2\pi (\frac{N}{r} - 1) r n}{N}} }_{\frac{N}{r}} \big) \\ &= \begin{cases} \frac{1}{N} F[b] e^{i \frac{2\pi b n}{N}} \frac{N}{r} & \text{if $n = k \frac{N}{r}$ and $k = 0, 1, \cdots, r-1$}\\ 0 & \text{else}\\ \end{cases} \\ &= \begin{cases} \frac{1}{r} F[b] e^{i \frac{2\pi b n}{N}} & \text{if $n = k \frac{N}{r}$ and $k = 0, 1, \cdots, r-1$}\\ 0 & \text{else}\\ \end{cases} \\ \end{align}\]

The magnitude of the $f$ is

\[\begin{align} \rho_{f}[n] &= \begin{cases} \frac{1}{r} F[b] & \text{if $n = k \frac{N}{r}$ and $k = 0, 1, \cdots, r-1$}\\ 0 & \text{else}\\ \end{cases} \\ \end{align}\]

and the normalized probability for $f$ is

\[\begin{align} p_f[n] &= \begin{cases} \frac{1}{r} & \text{if $n = k \frac{N}{r}$ and $k = 0, 1, \cdots, r-1$}\\ 0 & \text{else}\\ \end{cases} \\ \end{align}\]

So in the normalized probability sequence, there are only $r$ values in the normalized probability sequence whose values are not $0$, and the period is exactly $\frac{N}{r}$.


This concludes the proof for the scenario where $r$ divides $N$.


What about the scenario where $r$ does not divide $N$? When $r$ does not divide $N$, we could have two possible expressions of $f$, depending on if the last $1$ is “truncated” or nor.

\[\begin{align} f[n] &= \frac{1}{N} F[b] e^{i \frac{2\pi b n}{N}} \big( \underbrace{ 1 + e^{i \frac{2\pi r n}{N}} + \cdots + e^{i \frac{2\pi \big\lfloor \frac{N}{r} \big\rfloor r n}{N}} }_{\big\lfloor \frac{N}{r} \big\rfloor + 1 } \big) \\ \end{align}\]

or

\[\begin{align} f[n] &= \frac{1}{N} F[b] e^{i \frac{2\pi b n}{N}} \big( \underbrace{ 1 + e^{i \frac{2\pi r n}{N}} + \cdots + e^{i \frac{2\pi \big( \big\lfloor \frac{N}{r} \big\rfloor - 1 \big) r n}{N}} }_{\big\lfloor \frac{N}{r} \big\rfloor} \big) \\ \end{align}\]

Let’s further investigate the sum of the geometric sequence.

\[\begin{align} \underbrace{ 1 + e^{i \frac{2\pi r n}{N}} + \cdots + e^{i \frac{2\pi \big\lfloor \frac{N}{r} \big\rfloor r n}{N}} }_{\big\lfloor \frac{N}{r} \big\rfloor + 1 } &= \begin{cases} \frac{N}{r} & \text{if $e^{i \frac{2\pi r n}{N}} = 1$}\\ \frac{1 - \big(e^{i \frac{2\pi r n}{N}}\big)^{\big\lfloor \frac{N}{r} \big\rfloor + 1} }{1 - e^{i \frac{2\pi r n}{N}}} & \text{else}\\ \end{cases} \\ \end{align}\] \[\begin{align} \underbrace{ 1 + e^{i \frac{2\pi r n}{N}} + \cdots + e^{i \frac{2\pi \big( \big\lfloor \frac{N}{r} \big\rfloor - 1 \big) r n}{N}} }_{\big\lfloor \frac{N}{r} \big\rfloor} &= \begin{cases} \frac{N}{r} & \text{if $e^{i \frac{2\pi r n}{N}} = 1$}\\ \frac{1 - \big(e^{i \frac{2\pi r n}{N}}\big)^{\big\lfloor \frac{N}{r} \big\rfloor} }{1 - e^{i \frac{2\pi r n}{N}}} & \text{else}\\ \end{cases} \\ \end{align}\]

Because $r$ does not divide $N$, $e^{i \frac{2\pi r n}{N}} = 1$ could never be satisfied. So we have

\[\begin{align} \underbrace{ 1 + e^{i \frac{2\pi r n}{N}} + \cdots + e^{i \frac{2\pi \big\lfloor \frac{N}{r} \big\rfloor r n}{N}} }_{\big\lfloor \frac{N}{r} \big\rfloor + 1 } &= \frac{1 - \big(e^{i \frac{2\pi r n}{N}}\big)^{\big\lfloor \frac{N}{r} \big\rfloor + 1} }{1 - e^{i \frac{2\pi r n}{N}}} \\ \end{align}\] \[\begin{align} \underbrace{ 1 + e^{i \frac{2\pi r n}{N}} + \cdots + e^{i \frac{2\pi \big( \big\lfloor \frac{N}{r} \big\rfloor - 1 \big) r n}{N}} }_{\big\lfloor \frac{N}{r} \big\rfloor} &= \frac{1 - \big(e^{i \frac{2\pi r n}{N}}\big)^{\big\lfloor \frac{N}{r} \big\rfloor} }{1 - e^{i \frac{2\pi r n}{N}}} \\ \end{align}\]

Therefore,

\[\begin{align} f[n] &= \frac{1}{N} F[b] e^{i \frac{2\pi b n}{N}} \frac{1 - \big(e^{i \frac{2\pi r n}{N}}\big)^{\big\lfloor \frac{N}{r} \big\rfloor + 1} }{1 - e^{i \frac{2\pi r n}{N}}} \\ \end{align}\]

or

\[\begin{align} f[n] &= \frac{1}{N} F[b] e^{i \frac{2\pi b n}{N}} \frac{1 - \big(e^{i \frac{2\pi r n}{N}}\big)^{\big\lfloor \frac{N}{r} \big\rfloor} }{1 - e^{i \frac{2\pi r n}{N}}} \\ \end{align}\]

Let’s first check what the magnitude of $1 - e^{i \theta}$ is.

\[\begin{align} |1 - e^{i \theta}| &= | 1 - \cos\theta - i\sin\theta | \\ &= \sqrt{(1 - \cos\theta)^2 + (-\sin\theta)^2} \\ &= \sqrt{2 - 2 \cos\theta} \\ \end{align}\]

So we have

\[\begin{align} \rho[n] &= \frac{1}{N} F[b] \frac{\sqrt{2 - 2 \cos \big( \frac{2\pi r n}{N} \big\lfloor \frac{N}{r} \big\rfloor} \big) }{\sqrt{2 - 2 \cos \frac{2\pi r n}{N} }} \\ &= \frac{1}{N} F[b] \frac{\sqrt{1 - \cos \big( \frac{2\pi r n}{N} \big\lfloor \frac{N}{r} \big\rfloor} \big) }{\sqrt{1 - \cos \frac{2\pi r n}{N} }} \\ &= \frac{1}{N} F[b] \sqrt{ \frac{1 - \cos \big( \frac{2\pi r n}{N} \big\lfloor \frac{N}{r} \big\rfloor \big) }{1 - \cos \frac{2\pi r n}{N} } } \\ \end{align}\]

or

\[\begin{align} \rho[n] &= \frac{1}{N} F[b] \sqrt{ \frac{1 - \cos \big( \frac{2\pi r n}{N} \big( \big\lfloor \frac{N}{r} \big\rfloor + 1 \big) \big) }{1 - \cos \frac{2\pi r n}{N} } } \\ \end{align}\]

Because

\[\begin{align} 1 - \cos \bigg( \frac{2\pi r n}{N} \big\lfloor \frac{N}{r} \big\rfloor \bigg) &\approx 1 - \cos \bigg( \frac{2\pi r n}{N} \frac{N}{r} \bigg) \\ &= 1 - \cos \big( 2\pi n \big) \\ &= 1 - 1 \\ &= 0 \\ \end{align}\]

So $1 - \cos \big( \frac{2\pi r n}{N} \big\lfloor \frac{N}{r} \big\rfloor \big)$ must be a very small number close to $0$ no matter how $n$ changes. The same is also true for $1 - \cos \big( \frac{2\pi r n}{N} \big( \big\lfloor \frac{N}{r} \big\rfloor + 1 \big) \big)$.


Therefore, when $1 - \cos \frac{2\pi r n}{N}$ is the smallest, $\rho[n]$ is the largest. In addition, if $1 - \cos \frac{2\pi r n}{N}$ deviates from $0$, $\rho[n]$ would immediately become a value close to $0$.


So $1 - \cos \frac{2\pi r n}{N}$ is the smallest if and only if

\[n = k \big\lfloor \frac{N}{r} \big\rfloor\]

or

\[n = k \bigg( \big\lfloor \frac{N}{r} \big\rfloor + 1\bigg)\]

where $k$ could be any non-negative integer and $0 \leq n < N$.


Therefore, even if $r$ does not divide $N$, the behavior of the magnitude and normalized probabilities of the transformed sequence is similar to the ones from the scenario where $r$ divides $N$.

Conclusions

Using the effect of discrete Fourier transform or inverse discrete Fourier transform on $0/1$ periodic sequence, we could transform a high frequency $0/1$ periodic sequence to a low frequency sequence.


In fact, the periodic sequence does not have to be $0/1$ periodic sequence. It could be any $0/v$ periodic sequence, as long as $v$ is a constant value. Discrete Fourier transform or inverse discrete Fourier transform would have the same effect. To prove this, simply plugging a constant value $v$ into the proof for the $0/1$ periodic sequence and the proof is still valid.

Final Remarks

This is actually a key part to understand Shor’s algorithm. But I have never seen any formal textbooks or tutorials discussing this phenomenon. It took me more than a lot of time to gradually understand this phenomenon and derive the proof from scratch.