Half-Open Range for Algorithm Implementation
Introduction
Consider implementing a binary search algorithm for a sorted $0$-indexed array of size $n$, we would use index to retrieve the value from the array and compare with the target value $v$. We will start from the two-ends of the array with two indices $i$ and $j$. There are usually two choices for the array representation.
- Half-open range. An array is represented using $[i, j)$, i.e., elements with index in $[i, j-1]$ belong to the array.
- None-open range. An array is represented using $[i, j]$, i.e., elements with index in $[i, j]$ belong to the array.
These two array representations could also be extended to the implementation of other algorithms. Further more, the index notion could be somewhat translated more generally to C/C++ pointers and iterators used in various kind of containers.
In this blog post, I would like to discuss the difference between the half-open range representation and the none-open range representation using the binary search algorithm implementations as an example and which representation we should use in practice.
Binary Search Algorithm
Implementations
Let’s see how we will implement the binary search algorithm using the half-open range representation and the none-open range representation.
1 |
|
Half-Open Range VS None-Open Range
It turns out that the none-open range implementation is logically more complicated than the half-open range implementation. When I was creating the implementations, I completed half-open range implementation pretty quickly. However, I spent much more time on the none-open range implementation and I failed the simple unit tests a couple of times before I finally implemented it correctly.
Fundamentally, that the none-open range implementation is more complicated is because there is no way for none-open range to represent an empty array. For the half-open range array representation, $[i, j)$ is a valid representation for an array for $0 \leq i \leq j \leq n$. In particular, when $i = j$, $[i, j)$ represents an empty array. However, for the none-open range array representation, $[i, j]$ is a valid representation for an array for $0 \leq i \leq j \leq n - 1$. But there is no way to represent an empty array mathematically. When $i = j$, $[i, j]$ always represents an array containing one element.
Conclusions
Always use the half-open range representation for all different kind of containers and their related algorithms. In fact, the notion of half-open range is fundamental to C++ STL and other programming languages.
Half-Open Range for Algorithm Implementation