C++ STL
Introduction
The Standard Template Library (STL) is a software library originally designed by Alexander Stepanov for the C++ programming language that influenced many parts of the C++ Standard Library. It provides four components called containers, algorithms, iterators, and functions. Later, the STL has been expanded routinely but the conventions followed by STL remain the same.
In this blog post, I would like to discuss the C++ STL.
STL and Conventions
The STL is held by conventions.
- Containers represent sequences of values, even though the values do not necessarily have to be consecutive on memory.
- Containers provide iterators that know how to traverse the sequences they represent.
- Algorithms use iterators to define the sequences of values on which they should operate.
- Algorithms consume functions to become more versatile for different use cases.
These conventions cover all the four fundamental components of STL, containers, algorithms, iterators, and functions.
So what is STL? Jon Kalb’s informal definition is
Anything in the standard C++ library supporting iterators.
Because containers provide iterators and algorithms use iterators in the C++ standard library.
Containers
A container is a holder object that stores a collection of other objects (its elements). They are implemented as class templates, which allows a great flexibility in the types supported as elements.
Iterators
In the STL, pointer-like objects are known as iterators.
Iterators can be thought of as restricted generalized pointers. They are restricted, because they do not necessarily support all pointer
operations. For example, sometimes, it does not have to support both ==
and !=
, and sometimes, it does not have to support --
. They are generalized, because they do not have to be real pointers. Often the time, they are implemented as nested class inside the container class.
Algorithms
Algorithms use iterators to specify half-open ranges defining the sequences of values on which the algorithms should operate.
Algorithms could be general for most or any of the STL containers but are less optimized. For example, std::find
is a general search algorithm for finding a value from all the STL containers. Its complexity is $O(N)$.
Algorithms could also be specialized for certain STL containers and are highly optimized. For example, std::unordered_map<Key,T,Hash,KeyEqual,Allocator>::find
is a specialized search algorithm for finding a value from the STL container std::unordered_map<Key,T,Hash,KeyEqual,Allocator>
. Its complexity is $O(1)$ on average.
Because containers are often template classes, so algorithms are also often template functions.
Functions
In practice, instead of using conventional functions (pointers to functions), we use function object and lambda expression for STL algorithm, mainly because their performances and programming experiences are much better.
Why Function Object?
Function object, when passing to a function, can be optimized as inline
by the compiler, because the compiler knows explicitly the definition of the object at compile time and therefore applies code optimizations accordingly.
For example, we have the following two implementations for find_if
. The former implementation takes a function pointer as predicator and the latter implementation takes a template class object as predicator.
1 | template <class It, class T> |
1 | template <class It, class Pred> |
Why function object over pointer to function in template function?
It is more general. The former one restricts that the return type must be
bool
, whereas the latter one allows the return type to be anything that is convertible tobool
. The input type must beT const&
and not anything that is convertible toT const&
. The former one can only take pointer to function, but the latter one can take almost anything, including pointer to function, function object, and closure.Function object calls can be
inline
optimized in template function, because the compiler has to know explicitly the definitions of the object and the template function at compile time and therefore generates specialized code with optimizations accordingly. Say, we have 3 different types of function objects used infind_if
, the compiler will generate 3 different specialized code blocks for the 3 different types of function objects. Pointer to function, however, cannot beinline
optimized in template function, because the compiler will only in principle check the type of the pointer matches and rarely any optimization has to be applied. In this case, the compiler will generate one code block only. The performance of the one with function object is extremely likely to be faster than the one with pointer to function, because of theinline
optimization.Function objects with parameters avoid code bloat. Suppose we have to create 10 thousands of different predicator for
find_if
to see if there is any value in the container matches certain value. Without using function objects with parameters, we will have to create 10 thousands of different functions, resulting in code bloat. Even with template function, the compiler will generate 10 thousands of different functions, still resulting in code bloat.
Manually implementing functions,
1 | bool equal_1(int val) |
Even with non-type value template functions,
1 | template <int target> |
With function objects with parameters, only one copy of the class member functions will be generated, no matter how many objects are created.
1 | class Equal |
Why Lambda Expression?
Similar to function object, lambda expression also supports type deduction and it could be inline
optimized in template function as well. Essentially, lambda expression have all the advantages that the function object has. Besides, it is self expressive.
Summary
- There are four fundamental components in the STL, containers, iterators, algorithms, and functions (function objects and lambda expressions).
- Containers are class templates.
- Data was managed by the container.
- Iterators are pointer-like objects.
- Pairs are often used to specify half-open ranges on which algorithms should work.
- The “one past the end” iterator, i.e., the iterator returned by
std::end
, is often used to signal failure.
- Algorithms are function templates.
- They often encapsulate loops, e.g.,
find
andfind_if
.
- They often encapsulate loops, e.g.,
- Function objects are function-like objects.
- They are generated by classes that overload
operator()
.
- They are generated by classes that overload
- The STL is held together by conventions.
- This allows for flexibility and extensibility.