Graph is a common data structure that represents the relationships between different entities using edges and vertices. In practice, a graph can be extremely large, and there is a need to traverse or search the graph along the edges. Breadth-first search and depth-first search algorithms are the most commonly used traverse and search algorithm for graphs.
In this blog post, I would like to discuss the implementations of directed and undirected graphs, breath-first and depth-first search algorithms.
Graph Search and Traversal
Graph search and traversal are fundamentally the same. We will use the terms, search and traversal, interchangeably.
Graph Representation and Implementation
Graph is commonly represented using adjacency matrix or adjacency list. Adjacency matrix is not favored for large graphs because usually the edges in large graphs are sparse. Adjacency list uses the least amount of the memory, but querying the specified neighbor of a node can be inefficient in a graph whose edges are dense. So we uses adjacency set instead which is fast for query and not too bad for memory.
Graph can be directed and undirected. There could also be isolated vertices in the graph that have no edges. There could also be vertices that connects with themselves. These graph features should not be ignored and have to be implemented accordingly.
if u notin self.children: raise RuntimeError(f"Node {u} does not exist in the graph.")
return self.children[u]
Breadth-First Search (BFS)
The breadth-first search algorithm explores all the nodes at the current level before it explores the next level. To achieve this, it uses a first-in-first-out (FIFO) queue.
# Graph search/traversal algorithms. # These algorithms could be modified to achieve different goals for graphs. from typing import Iterable, List, Union, Set from queue import Queue from graph import Graph, DirectedGraph, UndirectedGraph
# BFS supports multiple start nodes. # O(V + E) where V is the number of nodes and E is the number of edges.
search_order = []
num_nodes = graph.get_num_nodes() # First in first out (FIFO) q = Queue(maxsize=num_nodes)
# Store the nodes that have been visited. visited = set()
# Mark the source node as visited and enqueue it. for node in start_nodes: q.put(node) visited.add(node)
whilenot q.empty(): src_node = q.get() search_order.append(src_node) for tgt_node in graph.get_neighbors(src_node): if tgt_node notin visited: q.put(tgt_node) visited.add(tgt_node)
return search_order
Depth-First Search (DFS)
The depth-first search algorithm explores as far as possible along each branch before backtracking. It can be implemented recursively or iteratively. The recursive algorithm naturally explores the depth first but is not favored for large graphs because of stack overflow. The iterative algorithm uses a last-in-first-out (LIFO) stack to explore the depth first.
# Recursive algorithm is not favored for large graphs because of stack overflow. # O(V + E) where V is the number of nodes and E is the number of edges.
visited.add(src_node) search_order.append(src_node) for tgt_node in graph.get_neighbors(src_node): if tgt_node notin visited: depth_first_recursion(graph=graph, visited=visited, src_node=tgt_node, search_order=search_order)
# Store the nodes that have been visited. visited = set() search_order = []
if start_node notin visited: depth_first_recursion(graph=graph, visited=visited, src_node=start_node, search_order=search_order)