Lei Mao bio photo

Lei Mao

Machine Learning, Artificial Intelligence, Computer Science.

Twitter Facebook LinkedIn GitHub   G. Scholar E-Mail RSS

Introduction

In C programming, since there is no advanced data structure, to use hash table or hashmap, we would have to implement them by ourselves. In C++ programming, fortunately, there are standard containers or abstractions, such as std::unordered_map and std::unordered_set, that have been implemented for us.


In this blog post, I would like to discuss std::unordered_map, std::unordered_set, and some of the caveats.

HashMaps

Unordered Map

std::unordered_map is very like Python’s dict except that the type of key and value in std::unordered_map have to be uniform for all the keys and values, respectively.

#include <unordered_map>
#include <iostream>

int main()
{
    std::unordered_map<std::string, int> age;
    // Insert
    age["Michael"] = 16;
    age.insert(std::pair<std::string, int>{"Bill", 25});
    age.insert({"Chris", 30});

    // Search and change
    age["Michael"] = 18;
    age.at("Chris") = 27;

    // Check if key exists
    std::string query;
    query = "Eric";
    if (age.find(query) == age.end())
    {
        std::cout << query << " is not in the dictionary!" << std::endl;
    }

    // Delete
    query = "Michael";
    if (age.find(query) == age.end())
    {
        std::cout << query << " is not in the dictionary!" << std::endl;
    }
    age.erase(query);
    if (age.find(query) == age.end())
    {
        std::cout << query << " is not in the dictionary!" << std::endl;
    }

    // Iterate
    for (const std::pair<std::string, int>& tup : age)
    {
        std::cout << "Name: " << tup.first << std::endl;
        std::cout << "Age: " << tup.second << std::endl;
    }
}

Unordered Set

Sometimes, we would like to only care about the keys and the values are useless. Instead of putting a dummy value into std::unordered_map, we would just use std::unordered_set. Its usages are also similar to std::unordered_map.

#include <unordered_set>
#include <iostream>

int main()
{
    std::unordered_set<std::string> people;
    // Insert
    people.insert(std::string{"Michael"});
    people.insert("Chris");

    // Check if key exists
    std::string query;
    query = "Michael";
    if (people.find(query) == people.end())
    {
        std::cout << query << " is not in the dictionary!" << std::endl;
    }

    // Delete
    people.erase(query);
    if (people.find(query) == people.end())
    {
        std::cout << query << " is not in the dictionary!" << std::endl;
    }

    // Iterate
    for (const std::string& name : people)
    {
        std::cout << "Name: " << name << std::endl;
    }
}

Unhashable Types

To use std::unordered_map, and std::unordered_set, the key type has to be hashable by std::hash<Key>. What if the key type is unhashable?


According to the definition of the std::unordered_map, we could implement our own hash class for Hash in replace of the default std::hash<Key> in the template.

template<
    class Key,
    class T,
    class Hash = std::hash<Key>,
    class KeyEqual = std::equal_to<Key>,
    class Allocator = std::allocator< std::pair<const Key, T> >
> class unordered_map;

For example, std::tuple<int, std::string> is not supported by std::hash, so we implemented a functor CustomHash to hash the variables of this specific type.

#include <unordered_set>
#include <unordered_map>
#include <tuple>
#include <iostream>
#include <functional>

typedef std::tuple<int, char> nkey_t;

// Implement a hash functor
class CustomHash
{
public:
// Implement a hash function
std::size_t operator()(const nkey_t& k) const
{
    // This could be a bad hash function anyway
    std::size_t h1 = std::hash<int>{}(std::get<0>(k));
    std::size_t h2 = std::hash<char>{}(std::get<1>(k));
    return h1 ^ h2;
}
};

int main()
{
    std::unordered_map<nkey_t, std::string, CustomHash> umap;
    std::unordered_set<nkey_t, CustomHash> set;

    nkey_t key{1, 'a'};
    umap[key] = "AAA";
    set.insert(key);
}

Caveats

The usages of std::map the same to std::unordered_map. Simply replacing unordered_map to map in the last example would work. What’s different to std::unordered_map is that std::map preserves the order of the keys and it uses tree data structure instead of using hash. So it is not useful for latency demanding queries. In principle, we should not consider std::map as a hashmap.

#include <map>
#include <iostream>

int main()
{
    std::map<std::string, int> age;
    // Insert
    age["Michael"] = 16;
    age.insert(std::pair<std::string, int>{"Bill", 25});
    age.insert({"Chris", 30});

    // Search and change
    age["Michael"] = 18;
    age.at("Chris") = 27;

    // Check if key exists
    std::string query;
    query = "Eric";
    if (age.find(query) == age.end())
    {
        std::cout << query << " is not in the dictionary!" << std::endl;
    }

    // Delete
    query = "Michael";
    if (age.find(query) == age.end())
    {
        std::cout << query << " is not in the dictionary!" << std::endl;
    }
    age.erase(query);
    if (age.find(query) == age.end())
    {
        std::cout << query << " is not in the dictionary!" << std::endl;
    }

    // Iterate
    for (const std::pair<std::string, int>& tup : age)
    {
        std::cout << "Name: " << tup.first << std::endl;
        std::cout << "Age: " << tup.second << std::endl;
    }
}

Average Complexities

Search Insert Delete
std::unordered_map O(1) O(1) O(1)
std::map O(log n) O(log n) O(log n)
std::unordered_set O(1) O(1) O(1)

Conclusions

Making good use of std::unordered_map and std::unordered_set would make your development in C++ much more efficient.