Comparison Based Sorting Algorithm Complexity Lower Bound

Introduction

Sorting algorithms are one of the most widely used and fundamental algorithms in computer programs. It’s very common to discuss the asymptotic complexity of different algorithms. However, for comparison based sorting algorithms, the asymptotic complexity lower bound is $O(N \log N)$ which is the asymptotic complexity of commonly used algorithms such as Merge sort and Heap sort.

In this blog post, I would like to quickly show a proof for the comparison based sorting algorithm asymptotic complexity lower bound.

Prerequisites

$O(\log(N!)) = O(N \log N)$

We would like to show that $O(\log(N!)) = O(N \log N)$.

It’s straightforward to show $O(\log(N!)) \leq O(N \log N)$.

$$
\begin{align}
\log(N!)
&= \log N + \log (N-1) + \log (N-2) + \cdots + \log 1 \\
&\leq \log N + \log N + \log N + \cdots + \log N \\
&= N \log N \\
\end{align}
$$

Thus,

$$
O(\log(N!)) \leq O(N \log N)
$$

To show $O(N \log N) \leq O(\log(N!))$, it’s equivalent to show $O(N \log N) \leq O(2\log(N!))$. Concretely,

$$
\begin{align}
2\log(N!) - N \log N
&= \left(\log N^2 - \log N \right) + \left(\log (N-1)^2 - \log N\right) + \left(\log (N-2)^2 - \log N\right) + \cdots + \left(\log (N-k)^2 - \log N\right) + \cdots + \left(\log 1^2 - \log N\right) \\
&= \left(\log \frac{N^2}{N} \right) + \left(\log \frac{(N-1)^2}{N} \right) + \left(\log \frac{(N-2)^2}{N}\right) + \cdots + \left(\log \frac{(N-k)^2}{N} \right) + \cdots + \left(\log \frac{1^2}{N} \right) \\
\end{align}
$$

Suppose $N$ is even and $N > 0$,

$$
\begin{align}
2\log(N!) - N \log N
&= \left(\log \frac{N^2}{N} \right) + \left(\log \frac{(N-1)^2}{N} \right) + \left(\log \frac{(N-2)^2}{N}\right) + \cdots + \left(\log \frac{(N-k)^2}{N} \right) + \cdots + \left(\log \frac{1^2}{N} \right) \\
&= \left(\log \frac{N^2}{N} + \log \frac{1^2}{N} \right) + \left(\log \frac{(N-1)^2}{N} + \log \frac{2^2}{N} \right) + \left(\log \frac{(N-2)^2}{N} + \log \frac{3^2}{N} \right) + \cdots + \left(\log \frac{(N-k)^2}{N} + \log \frac{(k+1)^2}{N}\right) + \cdots + \left(\log \frac{(N-\frac{N}{2})^2}{N} + \log \frac{(\frac{N}{2}+1)^2}{N}\right)\\
&= \left(\log \frac{N^2 1^2}{N^2} \right) + \left(\log \frac{(N-1)^2 2^2}{N^2} \right) + \left(\log \frac{(N-2)^2 3^2}{N^2} \right) + \cdots + \left(\log \frac{(N-k)^2 (k+1)^2}{N^2} \right) + \cdots + \left(\log \frac{(N-\frac{N}{2})^2(\frac{N}{2}+1)^2}{N^2} \right)\\
\end{align}
$$

For any integer $k$ that $0 \leq k \leq \frac{N}{2}$,

$$
\begin{align}
\frac{(N-k)^2 (k+1)^2}{N^2}
&= \left(\frac{(N-k)(k+1)}{N}\right)^2 \\
&= \left( -\frac{1}{N} k^2 + \left(1 - \frac{1}{N} \right) k + 1\right)^2 \\
\end{align}
$$

It’s not difficult to see that

$$
\min \left(-\frac{1}{N} k^2 + \left(1 - \frac{1}{N} \right) k + 1 \right) = 1
$$

when $k = 0$ and

$$
\max \left(-\frac{1}{N} k^2 + \left(1 - \frac{1}{N} \right) k + 1 \right) = \frac{N}{4} + \frac{1}{2} \geq 1
$$

when $k = \frac{N}{2}$.

Thus,

$$
\begin{align}
\frac{(N-k)^2 (k+1)^2}{N^2} \geq 1 \\
\end{align}
$$

and

$$
\begin{align}
\log \left( \frac{(N-k)^2 (k+1)^2}{N^2} \right) \geq 0 \\
\end{align}
$$

and further

$$
\begin{align}
2\log(N!) - N \log N \geq 0 \\
\end{align}
$$

and

$$
O(N \log N) \leq O(2\log(N!))
$$

i.e.,

$$
O(N \log N) \leq O(\log(N!))
$$

The same conclusion can be drawn similarly $N$ is odd and $N > 0$.

Finally, because $O(\log(N!)) \leq O(N \log N)$ and $O(N \log N) \leq O(\log(N!))$, we must have

$$
O\left(\log(N!)\right) = O\left(N \log N\right)
$$

This concludes the proof. $\square$

After proving this, I remembered that I can try ChatGPT out. Without very careful prompt engineering, or maybe I did not ask it using the most accurate math language, it seemed that ChatGPT can show $O(\log(N!)) \leq O(N \log N)$ easily, but it failed to show $O(N \log N) \leq O(\log(N!))$. Here is the ChatGPT chat history.

Comparison Based Sorting Algorithm Complexity Lower Bound

Given $N$ values to sort, the number of permutations for the $N$ values is $N!$. The sorted sequence must be one of the permutations.

In a comparison based sorting algorithm, we perform value comparisons. For each comparison, we can eliminate at most half of the permutations. Thus, to find out the sorted sequence, we need to perform at least $\log_{2}\left({N!}\right)$ comparisons.

Therefore, the comparison based sorting algorithm asymptotic complexity lower bound is $O\left(\log(N!)\right)$, which is just $O\left(N \log N\right)$.

This concludes the proof. $\square$

Author

Lei Mao

Posted on

06-12-2023

Updated on

06-12-2023

Licensed under


Comments