Radix Sort
Introduction
Radix sort is a non-comparative sorting algorithm that sorts numbers by processing individual digits. Radix means the “base” of a number system. For example, the decimal number system has a radix of 10, while the binary number system has a radix of 2. It works by sorting the input numbers based on each digit, starting from the least significant digit (LSD) to the most significant digit (MSD) or vice versa. Radix sort typically uses counting sort as a subroutine to sort the digits.
In this blog post, I would like to quickly discuss the radix sort algorithm and present a few implementations in Python and C++.
Counting Sort for Integer Numbers
The idea of counting sort is straightforward. We count the occurrences of each unique element in the input array and use this information to place each element directly into its correct position in the output array. Counting sort is efficient when the range of input values is not significantly larger than the number of elements to be sorted.
For example, consider the following array of integers:
1 | input_array = [4, 2, 2, 8, 3, 3, 1] |
We can create a count array to store the frequency of each integer:
1 | count = [0, 1, 2, 2, 1, 0, 0, 0, 1] # Index represents the integer value |
Next, we modify the count array to store the cumulative counts instead:
1 | count = [0, 1, 3, 5, 6, 6, 6, 6, 7] |
Finally, we build the output array by placing each element in its correct position based on the count array. This can be done iteratively.
1 | output_array = [-1, -1, -1, -1, -1, -1, -1] # Initialize output array |
Python Implementation
The counting sort algorithm can be implemented using Python as follows.
1 | def counting_sort(arr): |
1 | $ python counting_sort.py |
C++ Implementation
1 |
|
1 | $ g++ counting_sort.cpp -o counting_sort |
Radix Sort for Integer Numbers
Radix Sort for Non-Negative Integer Numbers
Radix sort is a non-comparative integer sorting algorithm that sorts numbers by processing individual digits. It works by sorting the input numbers based on each digit, starting from the least significant digit (LSD) to the most significant digit (MSD) or vice versa. Radix sort typically uses counting sort as a subroutine to sort the digits.
For example, consider the following array of integers:
1 | input_array = [170, 45, 75, 90, 802, 24, 2, 66] |
We can sort the array by processing each digit from the least significant to the most significant.
First, we sort the array based on the least significant digit (LSD) using counting sort:
1 | # After sorting by LSD |
Note that because counting sort is stable, the relative order of elements with the same digit is preserved. For example, 170 comes before 90 because it appeared first in the original array.
Next, we sort the array based on the next significant digit (the tens place) using counting sort. Those with no tens digit are considered to have a tens digit of 0.
1 | # After sorting by tens place |
Finally, we sort the array based on the most significant digit (the hundreds place) using counting sort. Similarly, those with no hundreds digit are considered to have a hundreds digit of 0.
1 | # After sorting by hundreds place |
Python Implementation
The radix sort algorithm can be implemented using Python as follows. This implementation assumes that all input numbers are non-negative integers.
1 | def counting_sort_for_radix(arr, exp): |
1 | $ python radix_sort.py |
C++ Implementation
1 |
|
1 | $ g++ radix_sort.cpp -o radix_sort |
Radix Sort for Integer Numbers
The aforementioned radix sort implementation works for non-negative integers. To make it work for all integers, we could scan the input array, find the minimum value, and shift all numbers by subtracting the minimum value before sorting. After sorting, we can shift the numbers back by adding the minimum value. This seems straightforward and will not affect the asymptotic time complexity of the algorithm.
The Correctness of Radix Sort
Radix sort is correct because it relies on the stability of the counting sort used as a subroutine. A stable sorting algorithm maintains the relative order of records with equal keys (or digits, in this case). By sorting the input array digit by digit from the least significant to the most significant, radix sort ensures that when we sort by a more significant digit, the order established by the less significant digits is preserved.
This means after sorting for those numbers which have the same number of digits they must be sorted correctly and for those which have different number of digits the sorting must be done correctly according to the number of digits. Thus, after processing all digits, the entire array is sorted correctly.
Radix Sort for Floating-Point Numbers
Radix Sort for Non-Negative Floating-Point Numbers
To extend radix sort to handle non-negative floating-point numbers, we can represent the non-negative floating-point numbers as unsigned integers using their binary representation. This will always work because the unsigned integer representation of non-negative floating-point numbers are monotonically increasing with respect to their floating-point values. Consequently, the order of non-negative floating-point numbers will be preserved when they are converted to their unsigned integer representations.
Radix Sort for Floating-Point Numbers
To extend radix sort to handle floating-point numbers, we will again represent the floating-point numbers as unsigned integers using their binary representation. However, we need to handle negative floating-point numbers carefully to ensure the correct order is preserved.
The trick is to flip the sign bit for all floating-point numbers, and for the originally negative floating-point numbers, we flip all the other bits as well. This is equivalent to flip the sign bit for positive floating-point numbers and flip all bits, i.e., one’s complement, for negative floating-point numbers. This transformation ensures that the unsigned integer representation of floating-point numbers is monotonically increasing with respect to their floating-point values, thus preserving the order.
This can be efficiently implemented using XOR operation in C++.
1 |
|
1 | $ g++ radix_sort_float_to_uint32.cpp -o radix_sort_float_to_uint32 |
C++ Implementation
1 |
|
1 | $ g++ radix_sort_float.cpp -o radix_sort_float |
Miscellaneous
Radix Sort Index Tracking
Sorting algorithms often need to return not only the sorted array but also the indices that would sort the array. This is particularly useful when we want to maintain the relationship between the sorted elements and their original positions. Because radix sort essentially performs multiple counting sorts, even though each counting sort can return a sorted index array straightforwardly, we need to carefully track the indices throughout the entire radix sort process.
In radix sort, we will maintain an auxiliary array that keeps track of the original indices of the elements. In each counting sort step, counting sort will return the a sorted index array for the array it processed. This returned indices can then be used to update our auxiliary array accordingly.
For example, suppose we have an array of four elements to be sorted by radix sort. The auxiliary index array will be initialized as [0, 1, 2, 3]. After the first counting sort, we get a sorted index array from counting sort, say [2, 0, 3, 1]. The auxiliary index array will then be updated to [2, 0, 3, 1]. In the next counting sort, we get a sorted index array, say [1, 0, 3, 2]. The auxiliary index array will then be updated to [0, 2, 1, 3].
Concretely, the update formula is as follows:
1 | for i in range(len(auxiliary_indices)): |
Radix Sort Asymptotic Complexity
It is obvious that the time complexity of counting sort is $\mathcal{O}(n + k)$, where $n$ is the number of elements in the input array and $k$ is the range of the input values. In radix sort, we perform counting sort for each digit of the maximum number in the input array. If $d$ is the number of digits in the maximum number, and we will have to perform counting sort $d$ times. Thus, the time complexity of radix sort is $\mathcal{O}(d \times (n + k))$. Because $k$ is usually pre-selected to be a constant, the time complexity of radix sort can be simplified to $\mathcal{O}(d \times n)$.
In practice, for fixed bit-width values, such as 32-bit or 64-bit integers or floating-point numbers, the number of digits $d$ is constant (e.g., for base 10, a 32-bit integer has at most 10 digits). Therefore, the time complexity of radix sort can be considered as linear, i.e., $\mathcal{O}(n)$. However, from the theoretical perspective, the numbers being sorted can be arbitrarily large, and thus $d$ cannot be treated as a constant. In such cases, the time complexity of radix sort remains $\mathcal{O}(d \times n)$.
This also means, when $n$ is very large, and $d$ remains constant or small, non-comparative sorting algorithms like radix sort can outperform comparative sorting algorithms like quicksort or mergesort, which have a time complexity of $\mathcal{O}(n \log n)$.