Lei Mao bio photo

Lei Mao

Machine Learning, Artificial Intelligence, Computer Science.

Twitter Facebook LinkedIn GitHub   G. Scholar E-Mail RSS

Introduction

In the previous blog post, I discussed the Deutsch’s algorithm which determines whether the function $f(x)$ is balanced or constant by running $f(x)$ once using quantum circuits, where $f(x)$ maps from the set $\{0,1\}$ to the set $\{0,1\}$. However, using classical circuits, we are able to determine whether the function $f(x)$ is balanced or constant by running $f(x)$ twice. It might seem to some people that the improvement is not significant for this particular simple problem.


Let’s change the problem setting to a similar but more sophisticated one. Instead of taking one single bit as input, $f(\mathbf{x})$ now takes $n$ bits as input. Now, $f(\mathbf{x})$ maps from the set $\{0,1\}^n$ to the set $\{0,1\}$. The number of different inputs to $f(\mathbf{x})$ is $2^n$. We call the the function $f(\mathbf{x})$ balanced if $f(\mathbf{x}) = 0$ for exactly $2^{n-1}$ of all the different inputs, and $f(\mathbf{x}) = 1$ for the rest of the inputs. We call the function $f(\mathbf{x})$ constant if $f(\mathbf{x}) = 0$ for all different inputs, or $f(\mathbf{x}) = 1$ for all different inputs. $f(\mathbf{x})$ could be neither balanced or constant, but in this particular problem, we assume the $f(\mathbf{x})$ we are investigating must be either balanced or constant. It should be noted that it is a more general case for the problem that the Deutsch algorithm was trying to solve. When $n=1$, it decays to the problem that the Deutsch algorithm was trying to solve.


With classical circuits, what is the minimum and maximum number of runs to determine whether $f(\mathbf{x})$ we are investigating must be either balanced or constant? We have $2^n$ different inputs to test for this binary classification problem. In the worst scenario, we could run into the situation that the first $2^{n-1}$ inputs we tested produce all $0$s or all $1$s. Then the $2^{n-1} + 1$th input we tested must tell us whether $f(\mathbf{x})$ could be neither balanced or constant. For example, if the first $2^{n-1}$ inputs we tested produce all $0$s, if the $2^{n-1} + 1$th input we tested produces $0$, then $f(\mathbf{x})$ is constant; if the the $2^{n-1} + 1$th input we tested produces $1$, then $f(\mathbf{x})$ is balanced. In the best scenario, we got both $0$ and $1$ in the first two runs, we immediately know $f(\mathbf{x})$ is balanced.


With quantum circuits, can we do better and how much can we do better? In fact, there is a Deutsch-Jozsa algorithm, a derivative of the Deutsch algorithm, which solve the problem using exactly one run. If you compared to the worst case using classical circuits, it is $2^{n-1} + 1$ times faster asymptotically, which is significant when $n$ is large.


In this blog post, I would like to discuss the Deutsch-Jozsa algorithm. It would be recommended if the reader could read the blog on Deutsch algorithm first, although this blog post should be mostly self-contained.

Prerequisites

Reducing Sum or Difference to Boolean

If $x$ and $y$ are binary values, $x, y \in \{0, 1\}$, we have

\[\begin{align} (-1)^{x + y} &= (-1)^{x \oplus y} \\ (-1)^{x - y} &= (-1)^{x \oplus y} \\ \end{align}\]

where $\oplus$ is $\text{XOR}$ (binary addition modulo 2). This could be easily verified using truth table.

Distributivity of XOR

If $x$, $y$, and $z$ are binary values, $x, y, z \in \{0, 1\}$, we have

\[\begin{align} (x \oplus y) \wedge z &= ( x \wedge z ) \oplus ( y \wedge z ) \\ \end{align}\]

I did not find a proof from the Internet so I derived a proof here. The proof is a little bit complicated and requires to use boolean algebra.


It is easy to verify using truth table that

\[\begin{align} x \oplus y &= (x \vee y) \wedge \neg(x \wedge y) \\ &= (x \wedge \neg y) \vee (\neg x \wedge y) \\ \end{align}\]

Using the basic boolean algebra, we have

\[\begin{align} (x \oplus y) \wedge z &= \big( (x \wedge \neg y) \vee (\neg x \wedge y) \big) \wedge z \\ &= \big( (x \wedge \neg y) \wedge z \big) \vee\big( (\neg x \wedge y) \wedge z \big) \\ &= ( x \wedge \neg y \wedge z ) \vee ( \neg x \wedge y \wedge z ) \\ \end{align}\] \[\begin{align} ( x \wedge z ) \oplus ( y \wedge z ) &= \big( ( x \wedge z ) \wedge \neg ( y \wedge z ) \big) \vee \big( \neg ( x \wedge z ) \wedge ( y \wedge z ) \big) \\ &= \big( ( x \wedge z ) \wedge ( \neg y \vee \neg z ) \big) \vee \big( ( \neg x \vee \neg z ) \wedge ( y \wedge z ) \big) \\ &= \Big( \big( ( x \wedge z ) \wedge \neg y \big) \vee \big( ( x \wedge z ) \wedge \neg z \big) \Big) \vee \Big( \big( \neg x \wedge ( y \wedge z ) \big) \vee \big( \neg z \wedge ( y \wedge z ) \big) \Big) \\ &= \Big( \big( ( x \wedge z ) \wedge \neg y \big) \vee 0 \Big) \vee \Big( \big( \neg x \wedge ( y \wedge z ) \big) \vee 0 \Big) \\ &= \big( ( x \wedge z ) \wedge \neg y \big) \vee \big( \neg x \wedge ( y \wedge z ) \big) \\ &= ( x \wedge z \wedge \neg y ) \vee ( \neg x \wedge y \wedge z ) \\ &= ( x \wedge \neg y \wedge z ) \vee ( \neg x \wedge y \wedge z ) \\ \end{align}\]

Therefore,

\[\begin{align} (x \oplus y) \wedge z &= ( x \wedge z ) \oplus ( y \wedge z ) \\ \end{align}\]

This concludes the proof.

Inner Product and Inner Product Space for Binary Vector Space

In the previous blog post, we have defined the inner product and inner product space for complex vector space. Similarly, we could also define the inner product and inner product space for binary vector space.

\[\begin{align} \langle -, - \rangle : \{0,1\}^n \times \{0,1\}^n \rightarrow \{0,1\} \end{align}\]

Given two binary vectors $\mathbf{x}, \mathbf{y} \in \{0,1\}^n$, $\mathbf{x} = \{x_0, x_1, \cdots, x_{n-1}\}$ and $\mathbf{y} = \{y_0, y_1, \cdots, y_{n-1}\}$, the inner product of $\mathbf{x}$ and $\mathbf{y}$ is defined as

\[\begin{align} \langle \mathbf{x}, \mathbf{y} \rangle &= (x_0 \wedge y_0) \oplus (x_1 \wedge y_1) \oplus \cdots \oplus (x_{n-1} \wedge y_{n-1}) \\ &= \bigoplus_{i=0}^{n-1} (x_i \wedge y_i ) \end{align}\]

which is somewhat similar to the inner product definition for real vector space.


The bitwise exclusive-or operation $\oplus$ was also defined for binary vectors $\mathbf{x}$ and $\mathbf{y}$ of the same length. Given two binary vectors $\mathbf{x}, \mathbf{y} \in \{0,1\}^n$, $\mathbf{x} = \{x_0, x_1, \cdots, x_{n-1}\}$ and $\mathbf{y} = \{y_0, y_1, \cdots, y_{n-1}\}$,

\[\begin{align} \mathbf{x} \oplus \mathbf{y} = \{x_0 \oplus y_0, x_1 \oplus y_1, \cdots, x_{n-1} \oplus y_{n-1}\} \end{align}\]

The following inner product properties are satisfied based on the above inner product definition.


Given $\mathbf{x}, \mathbf{x}^{\prime}, \mathbf{y}, \mathbf{y}^{\prime} \in \{0,1\}^n$, using the $\text{XOR}$ distributivity property we derived above,

\[\begin{align} \langle \mathbf{x} \oplus \mathbf{x}^{\prime}, \mathbf{y} \rangle &= \big((x_0 \oplus x_0^{\prime})\wedge y_0\big) \oplus \big((x_1 \oplus x_1^{\prime}) \wedge y_1\big) \oplus \cdots \oplus \big((x_{n-1} \oplus x_{n-1}^{\prime}) \wedge y_{n-1}\big) \\ &= \big((x_0 \wedge y_0) \oplus (x_0^{\prime} \wedge y_0) \big) \oplus \big((x_1 \wedge y_1) \oplus (x_1^{\prime} \wedge y_1) \big) \oplus \cdots \oplus \big((x_{n-1} \wedge y_{n-1}) \oplus (x_{n-1}^{\prime} \wedge y_{n-1}) \big) \\ &= (x_0 \wedge y_0) \oplus (x_0^{\prime} \wedge y_0) \oplus (x_1 \wedge y_1) \oplus (x_1^{\prime} \wedge y_1) \oplus \cdots \oplus (x_{n-1} \wedge y_{n-1}) \oplus (x_{n-1}^{\prime} \wedge y_{n-1}) \\ &= \big( (x_0 \wedge y_0) \oplus (x_1 \wedge y_1) \oplus \cdots \oplus (x_{n-1} \wedge y_{n-1}) \big) \oplus \big( (x_0^{\prime} \wedge y_0) \oplus (x_1^{\prime} \wedge y_1) \oplus \cdots \oplus (x_{n-1}^{\prime} \wedge y_{n-1}) \big) \\ &= \langle \mathbf{x}, \mathbf{y} \rangle \oplus \langle \mathbf{x}^{\prime}, \mathbf{y} \rangle \\ \end{align}\]

Similarly,

\[\begin{align} \langle \mathbf{x}, \mathbf{y} \oplus \mathbf{y}^{\prime} \rangle = \langle \mathbf{x}, \mathbf{y} \rangle \oplus \langle \mathbf{x}, \mathbf{y}^{\prime} \rangle \\ \end{align}\]

Let $\mathbf{0} = \{ \underbrace{0, 0, \cdots, 0}_{n} \} = 0^n$, we have

\[\begin{align} \langle \mathbf{0}, \mathbf{y} \rangle = 0 \\ \langle \mathbf{x}, \mathbf{0} \rangle = 0 \\ \end{align}\]

Hadamard Operator

Hardmard operator is a special quantum operator, which could be represented using the following unitary matrix.

\[\begin{align} H &= \begin{bmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \\ \end{bmatrix} \\ &= \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 & -1 \\ \end{bmatrix} \end{align}\]

It is easy to see that

\[\begin{align} H|0\rangle &= \frac{|0\rangle + |1\rangle}{\sqrt{2}} \\ H|1\rangle &= \frac{|0\rangle - |1\rangle}{\sqrt{2}} \\ \end{align}\] \[\begin{align} H\frac{|0\rangle + |1\rangle}{\sqrt{2}} &= |0\rangle \\ H\frac{|0\rangle - |1\rangle}{\sqrt{2}} &= |1\rangle \\ \end{align}\]

We denote

\[\underbrace{H \otimes H \otimes \cdots \otimes H}_{n} = H^{\otimes n}\]

It is easy to see that $H$ is unitary, i.e., $H H^{\dagger} = H^{\dagger} H = I$ and $H^{\dagger} = H$. In fact, $H^{\otimes n}$ is unitary, and $(H^{\otimes n})^{-1} = H^{\otimes n}$.


For Kronecker product, the $A \otimes B$ is invertible if and only if both $A$ and $B$ are invertible and

\[\begin{align} (A \otimes B)^{-1} = A^{-1} \otimes B^{-1} \end{align}\]

It is easy to verify this using the Kronecker product mixed-product property.


Therefore,

\[\begin{align} (H^{\otimes n})^{-1} &= (H \otimes H^{\otimes {n-1}})^{-1} \\ &= H^{-1} \otimes (H^{\otimes {n-1}})^{-1} \\ &= H \otimes (H^{\otimes {n-1}})^{-1} \\ &= \cdots \\ &= H^{\otimes n} \end{align}\]

The values in the matrix $H^{\otimes n}$ follows specific patterns. Specifically,

\[H^{\otimes n}[\mathbf{i}, \mathbf{j}] = \frac{1}{\sqrt{2^n}} (-1)^{\langle \mathbf{i}, \mathbf{j} \rangle}\]

where $\mathbf{i}$ is the binary vector representation for row number $i$, and $\mathbf{j}$ is the binary vector representation for column number $j$.


Since I did not find any proof from the Internet, I would like to derive a simple proof using mathematical induction.


For the base case for $n = 1$, it is easy to verify that

\[H^{\otimes 1}[\mathbf{i}, \mathbf{j}] = \frac{1}{\sqrt{2^1}} (-1)^{\langle \mathbf{i}, \mathbf{j} \rangle}\]

where $i, j \in [0, 2^1)$, i.e., $\mathbf{i}, \mathbf{j} \in \{0, 1\}^{1}$.


Assuming for $n = k - 1$,

\[H^{\otimes {k-1}}[\mathbf{i}, \mathbf{j}] = \frac{1}{\sqrt{2^{k-1}}} (-1)^{\langle \mathbf{i}, \mathbf{j} \rangle}\]

where $i, j \in [0, 2^{k-1})$, i.e., $\mathbf{i}, \mathbf{j} \in \{0, 1\}^{k - 1}$.


For $n = k$, we have

\[\begin{align} H^{\otimes k} &= H \otimes H^{\otimes {k-1}} \\ &= \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 & -1 \\ \end{bmatrix} \otimes H^{\otimes {k-1}} \\ &= \frac{1}{\sqrt{2}} \begin{bmatrix} H^{\otimes {k-1}} & H^{\otimes {k-1}} \\ H^{\otimes {k-1}} & -H^{\otimes {k-1}} \\ \end{bmatrix} \\ \end{align}\]

Note that $H^{\otimes {k-1}}$ is a $2^{k-1}$ by $2^{k-1}$ matrix.


For $i \in [0, 2^{k-1})$ and $j \in [0, 2^{k-1})$,

\[\begin{align} H^{\otimes {k}}[\mathbf{i}, \mathbf{j}] &= \frac{1}{\sqrt{2}} H^{\otimes {k-1}}[\mathbf{i}^{\prime}, \mathbf{j}^{\prime}] \\ &= \frac{1}{\sqrt{2^{k}}} (-1)^{\langle \mathbf{i}^{\prime}, \mathbf{j}^{\prime} \rangle} \end{align}\]

where $\mathbf{i}^{\prime} = \mathbf{i}_{1:}$, and $\mathbf{j}^{\prime} = \mathbf{j}_{1:}$, $i_0 = 0$, and $j_0 = 0$.

\[\begin{align} \langle \mathbf{i}, \mathbf{j} \rangle &= (i_0 \wedge j_0) \oplus (i_1 \wedge j_1) \oplus \cdots \oplus (i_{n-1} \wedge j_{n-1}) \\ &= (0 \wedge 0) \oplus (i_1 \wedge j_1) \oplus \cdots \oplus (i_{n-1} \wedge j_{n-1}) \\ &= 0 \oplus (i_1 \wedge j_1) \oplus \cdots \oplus (i_{n-1} \wedge j_{n-1}) \\ &= (i_1 \wedge j_1) \oplus \cdots \oplus (i_{n-1} \wedge j_{n-1}) \\ &= \langle \mathbf{i}^{\prime}, \mathbf{j}^{\prime} \rangle \end{align}\]

Therefore,

\[H^{\otimes {k}}[\mathbf{i}, \mathbf{j}] = \frac{1}{\sqrt{2^{k}}} (-1)^{\langle \mathbf{i}, \mathbf{j} \rangle}\]

where $i \in [0, 2^{k-1})$ and $j \in [0, 2^{k-1})$.


For $i \in [2^{k-1}, 2^{k})$ and $j \in [0, 2^{k-1})$,

\[\begin{align} H^{\otimes {k}}[\mathbf{i}, \mathbf{j}] &= \frac{1}{\sqrt{2}} H^{\otimes {k-1}}[\mathbf{i}^{\prime}, \mathbf{j}^{\prime}] \\ &= \frac{1}{\sqrt{2^{k}}} (-1)^{\langle \mathbf{i}^{\prime}, \mathbf{j}^{\prime} \rangle} \end{align}\]

where $\mathbf{i}^{\prime} = \mathbf{i}_{1:}$, and $\mathbf{j}^{\prime} = \mathbf{j}_{1:}$, $i_0 = 1$, and $j_0 = 0$.

\[\begin{align} \langle \mathbf{i}, \mathbf{j} \rangle &= (i_0 \wedge j_0) \oplus (i_1 \wedge j_1) \oplus \cdots \oplus (i_{n-1} \wedge j_{n-1}) \\ &= (1 \wedge 0) \oplus (i_1 \wedge j_1) \oplus \cdots \oplus (i_{n-1} \wedge j_{n-1}) \\ &= 0 \oplus (i_1 \wedge j_1) \oplus \cdots \oplus (i_{n-1} \wedge j_{n-1}) \\ &= (i_1 \wedge j_1) \oplus \cdots \oplus (i_{n-1} \wedge j_{n-1}) \\ &= \langle \mathbf{i}^{\prime}, \mathbf{j}^{\prime} \rangle \end{align}\]

Therefore,

\[H^{\otimes {k}}[\mathbf{i}, \mathbf{j}] = \frac{1}{\sqrt{2^{k}}} (-1)^{\langle \mathbf{i}, \mathbf{j} \rangle}\]

where $i \in [2^{k-1}, 2^{k})$ and $j \in [0, 2^{k-1})$.


For $i \in [0, 2^{k-1})$ and $j \in [2^{k-1}, 2^{k})$,

\[\begin{align} H^{\otimes {k}}[\mathbf{i}, \mathbf{j}] &= \frac{1}{\sqrt{2}} H^{\otimes {k-1}}[\mathbf{i}^{\prime}, \mathbf{j}^{\prime}] \\ &= \frac{1}{\sqrt{2^{k}}} (-1)^{\langle \mathbf{i}^{\prime}, \mathbf{j}^{\prime} \rangle} \end{align}\]

where $\mathbf{i}^{\prime} = \mathbf{i}_{1:}$, and $\mathbf{j}^{\prime} = \mathbf{j}_{1:}$, $i_0 = 0$, and $j_0 = 1$.

\[\begin{align} \langle \mathbf{i}, \mathbf{j} \rangle &= (i_0 \wedge j_0) \oplus (i_1 \wedge j_1) \oplus \cdots \oplus (i_{n-1} \wedge j_{n-1}) \\ &= (0 \wedge 1) \oplus (i_1 \wedge j_1) \oplus \cdots \oplus (i_{n-1} \wedge j_{n-1}) \\ &= 0 \oplus (i_1 \wedge j_1) \oplus \cdots \oplus (i_{n-1} \wedge j_{n-1}) \\ &= (i_1 \wedge j_1) \oplus \cdots \oplus (i_{n-1} \wedge j_{n-1}) \\ &= \langle \mathbf{i}^{\prime}, \mathbf{j}^{\prime} \rangle \end{align}\]

Therefore,

\[H^{\otimes {k}}[\mathbf{i}, \mathbf{j}] = \frac{1}{\sqrt{2^{k}}} (-1)^{\langle \mathbf{i}, \mathbf{j} \rangle}\]

where $i \in [0, 2^{k-1})$ and $j \in [2^{k-1}, 2^{k})$.


For $i \in [2^{k-1}, 2^{k})$ and $j \in [2^{k-1}, 2^{k})$,

\[\begin{align} H^{\otimes {k}}[\mathbf{i}, \mathbf{j}] &= - \frac{1}{\sqrt{2}} H^{\otimes {k-1}}[\mathbf{i}^{\prime}, \mathbf{j}^{\prime}] \\ &= -\frac{1}{\sqrt{2^{k}}} (-1)^{\langle \mathbf{i}^{\prime}, \mathbf{j}^{\prime} \rangle} \end{align}\]

where $\mathbf{i}^{\prime} = \mathbf{i}_{1:}$, and $\mathbf{j}^{\prime} = \mathbf{j}_{1:}$, $i_0 = 1$, and $j_0 = 1$.

\[\begin{align} \langle \mathbf{i}, \mathbf{j} \rangle &= (i_0 \wedge j_0) \oplus (i_1 \wedge j_1) \oplus \cdots \oplus (i_{n-1} \wedge j_{n-1}) \\ &= (1 \wedge 1) \oplus (i_1 \wedge j_1) \oplus \cdots \oplus (i_{n-1} \wedge j_{n-1}) \\ &= 1 \oplus (i_1 \wedge j_1) \oplus \cdots \oplus (i_{n-1} \wedge j_{n-1}) \\ &= 1 \oplus \langle \mathbf{i}^{\prime}, \mathbf{j}^{\prime} \rangle \\ &= 1 \oplus \langle \mathbf{i}^{\prime}, \mathbf{j}^{\prime} \rangle \\ \end{align}\]

Based on the reducing sum or difference to boolean property we derived above,

\[\begin{align} (-1)^{\langle \mathbf{i}, \mathbf{j} \rangle} &= (-1)^{1 \oplus \langle \mathbf{i}^{\prime}, \mathbf{j}^{\prime} \rangle} \\ &= (-1)^{1 + \langle \mathbf{i}^{\prime}, \mathbf{j}^{\prime} \rangle} \\ &= -(-1)^{\langle \mathbf{i}^{\prime}, \mathbf{j}^{\prime} \rangle} \\ \end{align}\]

Therefore,

\[H^{\otimes {k}}[\mathbf{i}, \mathbf{j}] = \frac{1}{\sqrt{2^{k}}} (-1)^{\langle \mathbf{i}, \mathbf{j} \rangle}\]

where $i \in [2^{k-1}, 2^{k})$ and $j \in [2^{k-1}, 2^{k})$.


Taken together, for $n = k$,

\[H^{\otimes {k}}[\mathbf{i}, \mathbf{j}] = \frac{1}{\sqrt{2^{k}}} (-1)^{\langle \mathbf{i}, \mathbf{j} \rangle}\]

where $i, j \in [0, 2^k)$, i.e., $\mathbf{i}, \mathbf{j} \in \{0, 1\}^{k}$.


This finishes the mathematical induction and concludes the proof.


Note that the entries of the first column and the entries of the first row of $H^{\otimes {n}}$ have exactly the same value, $\frac{1}{\sqrt{2^{n}}}$. This is because

\[\begin{align} H^{\otimes {n}}[\mathbf{0}, \mathbf{j}] &= \frac{1}{\sqrt{2^{n}}} (-1)^{\langle \mathbf{0}, \mathbf{j} \rangle} \\ &= \frac{1}{\sqrt{2^{n}}} (-1)^{\langle \mathbf{0}, \mathbf{j} \rangle} \\ &= \frac{1}{\sqrt{2^{n}}} (-1)^{0} \\ &= \frac{1}{\sqrt{2^{n}}} \\ \end{align}\] \[\begin{align} H^{\otimes {n}}[\mathbf{i}, \mathbf{0}] &= \frac{1}{\sqrt{2^{n}}} (-1)^{\langle \mathbf{i}, \mathbf{0} \rangle} \\ &= \frac{1}{\sqrt{2^{n}}} (-1)^{\langle \mathbf{i}, \mathbf{0} \rangle} \\ &= \frac{1}{\sqrt{2^{n}}} (-1)^{0} \\ &= \frac{1}{\sqrt{2^{n}}} \\ \end{align}\]

Equivalently, we might write, $H^{\otimes {n}}_{0,:} = \frac{1}{\sqrt{2^{n}}} \mathbf{1}$ and $H^{\otimes {n}}_{:,0} = \frac{1}{\sqrt{2^{n}}} \mathbf{1}$.


To extract an arbitrary column $j$ from $H^{\otimes {n}}$, we prepared a one-hot quantum system basic state vector $| \mathbf{y} \rangle = [y_0, y_1, \cdots, y_{2^n-1}]^{\top}$, where $y_j = 1$ and $y_k = 0$ for $k \neq j$.

\[\begin{align} H^{\otimes {n}}_{:,j} &= H^{\otimes {n}} | \mathbf{y} \rangle \\ &= H^{\otimes n}[\mathbf{0}, \mathbf{j}] | \mathbf{x}_0 \rangle + H^{\otimes n}[\mathbf{1}, \mathbf{j}] | \mathbf{x}_1 \rangle + \cdots + H^{\otimes n}[\mathbf{2^n-1}, \mathbf{j}] | \mathbf{x}_{2^{n}-1} \rangle \\ &= \frac{1}{\sqrt{2^n}} (-1)^{\langle \mathbf{0}, \mathbf{j} \rangle} | \mathbf{x}_0 \rangle + \frac{1}{\sqrt{2^n}} (-1)^{\langle \mathbf{1}, \mathbf{j} \rangle} | \mathbf{x}_1 \rangle + \cdots + \frac{1}{\sqrt{2^n}} (-1)^{\langle \mathbf{2^n-1}, \mathbf{j} \rangle} | \mathbf{x}_{2^{n}-1} \rangle \\ &= \frac{1}{\sqrt{2^n}} \sum_{i=0}^{2^n-1} (-1)^{\langle \mathbf{i}, \mathbf{j} \rangle} | \mathbf{x}_i \rangle\\ \end{align}\]

where $| \mathbf{x}_i \rangle$ is a quantum system one-hot basic state vector, $|\mathbf{x}_i\rangle = [x_0, x_1, \cdots, x_{2^{n}-1}]^{\top}$, where $x_i = 1$ and $x_k = 0$ for $k \neq i$.


Since $H^{\otimes {n}}$ is unitary and $(H^{\otimes {n}})^{-1} = H^{\otimes {n}}$as we have discussed above,

\[\begin{align} H^{\otimes {n}} \Big( \frac{1}{\sqrt{2^n}} \sum_{i=0}^{2^n-1} (-1)^{\langle \mathbf{i}, \mathbf{j} \rangle} | \mathbf{x}_i \rangle \Big) &= H^{\otimes {n}} H^{\otimes {n}} | \mathbf{y} \rangle \\ &= I | \mathbf{y} \rangle \\ &= | \mathbf{y} \rangle \\ \end{align}\]

For example, if $j = 0$, $| \mathbf{y} \rangle = [\underbrace{1, 0, 0, \cdots, 0}_{2^n} ]^{\top} = | \mathbf{0} \rangle$,

\[\begin{align} H^{\otimes {n}} \Big( \frac{1}{\sqrt{2^n}} \sum_{i=0}^{2^n-1} | \mathbf{x}_i \rangle \Big) &= H^{\otimes {n}} \Big( \frac{1}{\sqrt{2^n}} \sum_{i=0}^{2^n-1} (-1)^{0} | \mathbf{x}_i \rangle \Big) \\ &= H^{\otimes {n}} \Big( \frac{1}{\sqrt{2^n}} \sum_{i=0}^{2^n-1} (-1)^{\langle \mathbf{i}, \mathbf{0} \rangle} | \mathbf{x}_i \rangle \Big) \\ &= H^{\otimes {n}} H^{\otimes {n}} | \mathbf{0} \rangle \\ &= I | \mathbf{0} \rangle \\ &= | \mathbf{0} \rangle \\ \end{align}\]

Deutsch-Jozsa Algorithm

The black-box $f(\mathbf{x})$ is represented using a quantum gate $U_f$. Our job is to determine whether the $f(\mathbf{x})$ corresponding to the $U_f$ is constant or balanced.

$U_f$

The quantum gate $U_f$ is a unitary matrix which maps from $| \mathbf{x} \rangle \otimes | y \rangle$ to $| \mathbf{x} \rangle \otimes | f(\mathbf{x}) \oplus y \rangle$, namely $U_f (| \mathbf{x} \rangle \otimes | y \rangle) = | \mathbf{x} \rangle \otimes | f(\mathbf{x}) \oplus y \rangle$, for $\mathbf{x} \in \{0, 1\}^n$ and $y \in \{0, 1\}$. When $y = 0$, $| f(\mathbf{x}) \oplus y \rangle = | f(\mathbf{x}) \oplus 0 \rangle = | f(\mathbf{x}) \rangle $, $| y \oplus f(\mathbf{x}) \rangle$ is just $| f(\mathbf{x}) \rangle$.


Note that the above mapping is not necessarily valid when $| \mathbf{x} \rangle$ and $| y \rangle$ are superpositions.


Let’s further check when $| \mathbf{x} \rangle$ and $| y \rangle$ are superpositions, what the outputs from $U_f$ will be. Perhaps we could achieve fewer runs with superpositions.

First Attempt

In the first attempt, we made the second input to $U_f$ a superposition, which is achieved by applying a Hadamard operator to the second input $|1\rangle$. This is almost the same as what we did in the second attempt in the Deutsch algorithm. The superposition is

\[\begin{align} H|1\rangle &= \frac{|0\rangle - |1\rangle}{\sqrt{2}} \\ \end{align}\]
First Attempt
\[\begin{align} |\varphi_0\rangle &= |\mathbf{x}\rangle \otimes |1\rangle\\ &= |\mathbf{x}, 1\rangle \\ \end{align}\] \[\begin{align} |\varphi_1\rangle &= (I \otimes H) |\varphi_0\rangle \\ &= (I \otimes H) (|\mathbf{x}\rangle \otimes |1\rangle) \\ &= I|\mathbf{x}\rangle \otimes H |1\rangle \\ &= |\mathbf{x}\rangle \otimes \frac{|0\rangle - |1\rangle}{\sqrt{2}} \\ &= \frac{|\mathbf{x}, 0\rangle - |\mathbf{x}, 1\rangle}{\sqrt{2}} \\ \end{align}\] \[\begin{align} |\varphi_2\rangle &= U_f |\varphi_1\rangle \\ &= U_f \frac{|\mathbf{x}, 0\rangle - |\mathbf{x}, 1\rangle}{\sqrt{2}} \\ &= \frac{U_f |\mathbf{x}, 0\rangle - U_f |\mathbf{x}, 1\rangle}{\sqrt{2}} \\ &= \frac{U_f (|\mathbf{x}\rangle \otimes |0\rangle) - U_f (|\mathbf{x}\rangle \otimes |1\rangle)}{\sqrt{2}} \\ &= \frac{|\mathbf{x}\rangle \otimes |0 \oplus f(x)\rangle - |\mathbf{x}\rangle \otimes |1 \oplus f(\mathbf{x})\rangle}{\sqrt{2}} \\ &= \frac{|\mathbf{x}\rangle \otimes |f(\mathbf{x})\rangle - |\mathbf{x}\rangle \otimes |\overline{f(\mathbf{x})}\rangle}{\sqrt{2}} \\ &= |\mathbf{x}\rangle \otimes \frac{|f(\mathbf{x})\rangle - |\overline{f(\mathbf{x})}\rangle}{\sqrt{2}} \\ \end{align}\]

Because $\frac{|f(\mathbf{x})\rangle - |\overline{f(\mathbf{x})}\rangle}{\sqrt{2}}$ is either $\frac{|0\rangle - |1\rangle}{\sqrt{2}}$ or $\frac{|1\rangle - |0\rangle}{\sqrt{2}}$, $|\varphi_2\rangle $ could be further simplified as

\[\begin{align} |\varphi_2\rangle &= (-1)^{f(\mathbf{x})} |\mathbf{x}\rangle \otimes \frac{|0\rangle - |1\rangle}{\sqrt{2}} \\ \end{align}\]

The observation from the first output is always the same to the first input, whereas there are 50-50 percent chance observing 0 or 1 for the second input, which is a single qubit.


This provides no information of determining whether $f(\mathbf{x})$ is constant or balanced either.

Second Attempt

In the second attempt, we made the both inputs to $U_f$ superpositions, which is achieved by applying $H^{\otimes n}$ to the first input $|\mathbf{0}\rangle = |0^n\rangle$ and applying $H$ to the second input $|1\rangle$. The superpositions for the first input and the second input are $H^{\otimes n}|\mathbf{0}\rangle$ and $H|1\rangle$, respectively.

\[\begin{align} H|1\rangle &= \frac{|0\rangle - |1\rangle}{\sqrt{2}} \\ \end{align}\]

$H^{\otimes n}|\mathbf{0}\rangle$ is actually just the first column of $H^{\otimes n}$, because $|\mathbf{0}\rangle_0 = 1$. Based on the derivation we have in the prerequisite for Hadamard operator, we have

\[\begin{align} H^{\otimes n}|\mathbf{0}\rangle &= \frac{1}{\sqrt{2^n}} \sum_{i=0}^{2^n-1} (-1)^{\langle \mathbf{i}, \mathbf{0} \rangle} | \mathbf{x}_i \rangle \\ &= \frac{1}{\sqrt{2^n}} \sum_{i=0}^{2^n-1} (-1)^{0} | \mathbf{x}_i \rangle \\ &= \frac{1}{\sqrt{2^n}} \sum_{i=0}^{2^n-1} | \mathbf{x}_i \rangle \\ \end{align}\]

Therefore, $H^{\otimes n}|\mathbf{0}\rangle$ is a superposition of all the basic states with equal probabilities.

Deutsch-Jozsa Algorithm
\[\begin{align} |\varphi_0\rangle &= |\mathbf{0}\rangle \otimes |1\rangle\\ &= |\mathbf{0}, 1\rangle \\ \end{align}\] \[\begin{align} |\varphi_1\rangle &= (H^{\otimes n} \otimes H) |\varphi_0\rangle \\ &= (H^{\otimes n} \otimes H) (|\mathbf{0}\rangle \otimes |1\rangle) \\ &= H^{\otimes n}|\mathbf{0}\rangle \otimes H |1\rangle \\ &= \frac{1}{\sqrt{2^n}} \sum_{i=0}^{2^n-1} | \mathbf{x}_i \rangle \otimes \frac{|0\rangle - |1\rangle}{\sqrt{2}}\\ \end{align}\] \[\begin{align} |\varphi_2\rangle &= U_f |\varphi_1\rangle \\ &= \frac{1}{\sqrt{2^n}} \sum_{i=0}^{2^n-1} U_f \Big( | \mathbf{x}_i \rangle \otimes \frac{|0\rangle - |1\rangle}{\sqrt{2}} \Big) \\ &= \frac{1}{\sqrt{2^n}} \sum_{i=0}^{2^n-1} \Big( (-1)^{f(\mathbf{x}_i)} |\mathbf{x}_i\rangle \otimes \frac{|0\rangle - |1\rangle}{\sqrt{2}} \Big) \\ &= \Big( \frac{1}{\sqrt{2^n}} \sum_{i=0}^{2^n-1} (-1)^{f(\mathbf{x}_i)} |\mathbf{x}_i\rangle \Big) \otimes \Big( \frac{|0\rangle - |1\rangle}{\sqrt{2}} \Big) \\ \end{align}\]

We could see that all the useful information about $f(\mathbf{x}_i)$ for $i \in [0, 2^n)$ is encoded in the phase of the first output. How could we extract these useful information?


If $f(\mathbf{x})$ is constant and $f(\mathbf{x}_i) = 0$ for $i \in [0, 2^n)$,

\[\begin{align} |\varphi_2\rangle &= \Big( \frac{1}{\sqrt{2^n}} \sum_{i=0}^{2^n-1} (-1)^{0} |\mathbf{x}_i\rangle \Big) \otimes \Big( \frac{|0\rangle - |1\rangle}{\sqrt{2}} \Big) \\ &= \Big( \frac{1}{\sqrt{2^n}} \sum_{i=0}^{2^n-1} |\mathbf{x}_i\rangle \Big) \otimes \Big( \frac{|0\rangle - |1\rangle}{\sqrt{2}} \Big) \\ \end{align}\]

If $f(\mathbf{x})$ is constant and $f(\mathbf{x}_i) = 1$ for $i \in [0, 2^n)$,

\[\begin{align} |\varphi_2\rangle &= \Big( \frac{1}{\sqrt{2^n}} \sum_{i=0}^{2^n-1} (-1)^{1} |\mathbf{x}_i\rangle \Big) \otimes \Big( \frac{|0\rangle - |1\rangle}{\sqrt{2}} \Big) \\ &= -\Big( \frac{1}{\sqrt{2^n}} \sum_{i=0}^{2^n-1} |\mathbf{x}_i\rangle \Big) \otimes \Big( \frac{|0\rangle - |1\rangle}{\sqrt{2}} \Big) \\ \end{align}\]

The first output remind us the last example we discussed in the prerequisites for Hadamard operator. If we apply $H^{\otimes {n}}$ to the qubits, it would result in $|\mathbf{0}\rangle$ or $-|\mathbf{0}\rangle$, which means the first output state is deterministic and it is $|\mathbf{0}\rangle$ or $-|\mathbf{0}\rangle$. After measurement, the first output will be $\mathbf{0}$, i.e, $n$ $0$s.


If $f(\mathbf{x})$ is constant and $f(\mathbf{x}_i) = 0$ for $i \in [0, 2^n)$,

\[\begin{align} |\varphi_3\rangle &= (H^{\otimes n} \otimes I) |\varphi_2\rangle \\ &= \bigg( H^{\otimes n} \otimes I \bigg) \bigg( \Big( \frac{1}{\sqrt{2^n}} \sum_{i=0}^{2^n-1} |\mathbf{x}_i\rangle \Big) \otimes \Big( \frac{|0\rangle - |1\rangle}{\sqrt{2}} \Big) \bigg) \\ &= \bigg( H^{\otimes n} \Big( \frac{1}{\sqrt{2^n}} \sum_{i=0}^{2^n-1} |\mathbf{x}_i\rangle \Big) \bigg) \otimes \bigg( I \frac{|0\rangle - |1\rangle}{\sqrt{2}} \bigg) \\ &= | \mathbf{0} \rangle \otimes \bigg( \frac{|0\rangle - |1\rangle}{\sqrt{2}} \bigg) \\ \end{align}\]

If $f(\mathbf{x})$ is constant and $f(\mathbf{x}_i) = 1$ for $i \in [0, 2^n)$, similarly,

\[\begin{align} |\varphi_3\rangle &= - | \mathbf{0} \rangle \otimes \bigg( \frac{|0\rangle - |1\rangle}{\sqrt{2}} \bigg) \\ \end{align}\]

Let’s further examine if there is a chance for the first output state to be $|\mathbf{0}\rangle$ or $-|\mathbf{0}\rangle$ when $f(\mathbf{x})$ is balanced.


If $f(\mathbf{x})$ is balanced,

\[\begin{align} |\varphi_3\rangle &= (H^{\otimes n} \otimes I) |\varphi_2\rangle \\ &= \bigg( H^{\otimes n} \otimes I \bigg) \bigg( \Big( \frac{1}{\sqrt{2^n}} \sum_{i=0}^{2^n-1} (-1)^{f(\mathbf{x}_i)} |\mathbf{x}_i\rangle \Big) \otimes \Big( \frac{|0\rangle - |1\rangle}{\sqrt{2}} \Big) \bigg) \\ &= \bigg( H^{\otimes n} \Big( \frac{1}{\sqrt{2^n}} \sum_{i=0}^{2^n-1} (-1)^{f(\mathbf{x}_i)} |\mathbf{x}_i\rangle \Big) \bigg) \otimes \bigg( I \frac{|0\rangle - |1\rangle}{\sqrt{2}} \bigg) \\ &= \bigg( H^{\otimes n} \Big( \frac{1}{\sqrt{2^n}} \sum_{i=0}^{2^n-1} (-1)^{f(\mathbf{x}_i)} |\mathbf{x}_i\rangle \Big) \bigg) \otimes \bigg( \frac{|0\rangle - |1\rangle}{\sqrt{2}} \bigg) \\ &= \bigg( \frac{1}{\sqrt{2^n}} \sum_{i=0}^{2^n-1} (-1)^{f(\mathbf{x}_i)} H^{\otimes n} |\mathbf{x}_i\rangle \bigg) \otimes \bigg( \frac{|0\rangle - |1\rangle}{\sqrt{2}} \bigg) \\ &= \Bigg( \frac{1}{\sqrt{2^n}} \sum_{i=0}^{2^n-1} \bigg( (-1)^{f(\mathbf{x}_i)} \frac{1}{\sqrt{2^n}} \sum_{k=0}^{2^n-1} (-1)^{\langle \mathbf{k}, \mathbf{i} \rangle} | z_k \rangle \bigg) \Bigg) \otimes \bigg( \frac{|0\rangle - |1\rangle}{\sqrt{2}} \bigg) \\ &= \bigg( \frac{1}{2^n} \sum_{i=0}^{2^n-1} \sum_{k=0}^{2^n-1} (-1)^{f(\mathbf{x}_i)} (-1)^{\langle \mathbf{k}, \mathbf{i} \rangle} | z_k \rangle \bigg) \otimes \bigg( \frac{|0\rangle - |1\rangle}{\sqrt{2}} \bigg) \\ &= \bigg( \frac{1}{2^n} \sum_{i=0}^{2^n-1} \sum_{k=0}^{2^n-1} (-1)^{f(\mathbf{x}_i) + \langle \mathbf{k}, \mathbf{i} \rangle}| z_k \rangle \bigg) \otimes \bigg( \frac{|0\rangle - |1\rangle}{\sqrt{2}} \bigg) \\ &= \bigg( \frac{1}{2^n} \sum_{i=0}^{2^n-1} \sum_{k=0}^{2^n-1} (-1)^{f(\mathbf{x}_i) \oplus \langle \mathbf{k}, \mathbf{i} \rangle}| z_k \rangle \bigg) \otimes \bigg( \frac{|0\rangle - |1\rangle}{\sqrt{2}} \bigg) \\ &= \Bigg( \frac{1}{2^n} \sum_{k=0}^{2^n-1} \bigg( \sum_{i=0}^{2^n-1} (-1)^{f(\mathbf{x}_i) \oplus \langle \mathbf{k}, \mathbf{i} \rangle} \bigg) | z_k \rangle \Bigg) \otimes \bigg( \frac{|0\rangle - |1\rangle}{\sqrt{2}} \bigg) \\ \end{align}\]

where $| \mathbf{z}_k \rangle$ is a quantum system one-hot basic state vector, $|\mathbf{z}_k\rangle = [z_0, z_1, \cdots, z_{2^{n}-1}]^{\top}$, where $z_k = 1$ and $z_{k^{\prime}} = 0$ for $k^{\prime} \neq k$.


So the probability of collapsing to $\mathbf{0}$ for the first output is, according to the basic quantum state theory,

\[\begin{align} p(\mathbf{0}) &= \frac{ | \sum_{i=0}^{2^n-1} (-1)^{f(\mathbf{x}_i) \oplus \langle \mathbf{0}, \mathbf{i} \rangle} |^2 }{ \sum_{k=0}^{2^n-1} | \sum_{i=0}^{2^n-1} (-1)^{f(\mathbf{x}_i) \oplus \langle \mathbf{k}, \mathbf{i} \rangle} |^2 } \\ &= \frac{ | \sum_{i=0}^{2^n-1} (-1)^{f(\mathbf{x}_i) \oplus 0} |^2 }{ \sum_{k=0}^{2^n-1} | \sum_{i=0}^{2^n-1} (-1)^{f(\mathbf{x}_i) \oplus \langle \mathbf{k}, \mathbf{i} \rangle} |^2 } \\ &= \frac{ | \sum_{i=0}^{2^n-1} (-1)^{f(\mathbf{x}_i) } |^2 }{ \sum_{k=0}^{2^n-1} | \sum_{i=0}^{2^n-1} (-1)^{f(\mathbf{x}_i) \oplus \langle \mathbf{k}, \mathbf{i} \rangle} |^2 } \\ \end{align}\]

Let’s check what value $\sum_{i=0}^{2^n-1} (-1)^{f(\mathbf{x}_i) }$ is. Because $f(\mathbf{x})$ is balanced

\[\begin{align} \sum_{i=0}^{2^n-1} (-1)^{f(\mathbf{x}_i)} &= 2^{n-1} (-1)^0 + 2^{n-1} (-1)^1 \\ &= 2^{n-1} - 2^{n-1} \\ &= 0 \\ \end{align}\]

Therefore, if $f(\mathbf{x})$ is balanced, it is impossible to observe $\mathbf{0}$ for the first output.

Conclusions

By running the above circuit for Deutsch-Jozsa algorithm once, we were able to determine whether $f(\mathbf{x})$ is constant or balanced.

Final Remarks

As I mentioned, in principle, $f(\mathbf{x})$ could be neither constant nor balanced. What will happen if $f(\mathbf{x})$ could be neither constant nor balanced in the Deutsch-Jozsa algorithm. Based on the probability of collapsing to $\mathbf{0}$ we computed above, if $f(\mathbf{x})$ is very close to constant but it is not exact, there will be chance collapsing to $\mathbf{0}$, but the chance is very small. As $f(\mathbf{x})$ becomes more close to balanced, the chance of observing $\mathbf{0}$ should not be considered negligible.

Quiz

We did not compute the denominator for $p(\mathbf{0})$ since it is not helpful for solving our particular problem. However, I would still like to ask what the value is for the denominator when $f(\mathbf{x})$ is constant and when $f(\mathbf{x})$ is balanced.

References