Lei Mao bio photo

Lei Mao

Machine Learning, Artificial Intelligence, Computer Science.

Twitter Facebook LinkedIn GitHub   G. Scholar E-Mail RSS

Introduction

The Hungarian matching algorithm is a combinatorial optimization algorithm that solves the assignment linear-programming problem in polynomial time. The assignment problem is an interesting problem and the Hungarian algorithm is difficult to understand.


In this blog post, I would like to talk about what assignment is and give some intuitions behind the Hungarian algorithm.

Minimum Cost Assignment Problem

Problem Definition

In the matrix formulation, we are given a nonnegative $n \times n$ cost matrix, where the element in the $i$-th row and $j$-th column represents the cost of assigning the $j$-th job to the $i$-th worker. We have to find an assignment of the jobs to the workers, such that each job is assigned to one worker, each worker is assigned one job, and the total cost of assignment is minimum.


Finding a brute-force solution for this problem takes $O(n!)$ because the number of valid assignments is $n!$. We really need a better algorithm, which preferably takes polynomial time, to solve this problem.

Maximum Cost Assignment VS Minimum Cost Assignment

Solving a maximum cost assignment problem could be converted to solving a minimum cost assignment problem, and vice versa. Suppose the cost matrix is $c$, solving the maximum cost assignment problem for cost matrix $c$ is equivalent to solving the minimum cost assignment problem for cost matrix $-c$. So we will only discuss the minimum cost assignment problem in this article.

Non-Square Cost Matrix

In practice, it is common to have a cost matrix which is not square. But we could make the cost matrix square, fill the empty entries with $0$, and apply the Hungarian algorithm to solve the optimal cost assignment problem.

Hungarian Matching Algorithm

Algorithm

Brilliant has a very good summary on the Hungarian algorithm for adjacency cost matrix. Let’s walk through it.

  1. Subtract the smallest entry in each row from all the other entries in the row. This will make the smallest entry in the row now equal to 0.
  2. Subtract the smallest entry in each column from all the other entries in the column. This will make the smallest entry in the column now equal to 0.
  3. Draw lines through the row and columns that have the 0 entries such that the fewest lines possible are drawn.
  4. If there are nn lines drawn, an optimal assignment of zeros is possible and the algorithm is finished. If the number of lines is less than nn, then the optimal number of zeroes is not yet reached. Go to the next step.
  5. Find the smallest entry not covered by any line. Subtract this entry from each row that isn’t crossed out, and then add it to each column that is crossed out. Then, go back to Step 3.

Note that we could not tell the algorithm time complexity from the description above. The time complexity is actually $O(n^3)$ where $n$ is the side length of the square cost adjacency matrix.

Example

The Brilliant Hungarian algorithm for adjacency cost matrix also comes with a good example. We slightly modified the example by making the cost matrix non-square.

Company Cost for Musician Cost for Chef Cost for Cleaners
Company A 108 125 150
Company B 150 135 175

To avoid duplicating the solution on Brilliant, instead of solving it manually, we will use the existing SciPy linear sum assignment optimizer to solve, and verified using a brute force solver.

# hungarian.py
from typing import List, Tuple
import itertools
import numpy as np
from scipy.optimize import linear_sum_assignment


def linear_sum_assignment_brute_force(
        cost_matrix: np.ndarray,
        maximize: bool = False) -> Tuple[List[int], List[int]]:

    h = cost_matrix.shape[0]
    w = cost_matrix.shape[1]

    if maximize is True:
        cost_matrix = -cost_matrix

    minimum_cost = float("inf")

    if h >= w:
        for i_idices in itertools.permutations(list(range(h)), min(h, w)):
            row_ind = i_idices
            col_ind = list(range(w))
            cost = cost_matrix[row_ind, col_ind].sum()
            if cost < minimum_cost:
                minimum_cost = cost
                optimal_row_ind = row_ind
                optimal_col_ind = col_ind
    if h < w:
        for j_idices in itertools.permutations(list(range(w)), min(h, w)):
            row_ind = list(range(h))
            col_ind = j_idices
            cost = cost_matrix[row_ind, col_ind].sum()
            if cost < minimum_cost:
                minimum_cost = cost
                optimal_row_ind = row_ind
                optimal_col_ind = col_ind

    return optimal_row_ind, optimal_col_ind


if __name__ == "__main__":

    cost_matrix = np.array([[108, 125, 150], [150, 135, 175]])

    row_ind, col_ind = linear_sum_assignment_brute_force(
        cost_matrix=cost_matrix, maximize=False)

    minimum_cost = cost_matrix[row_ind, col_ind].sum()

    print(
        f"The optimal assignment from brute force algorithm is: {list(zip(row_ind, col_ind))}."
    )
    print(f"The minimum cost from brute force algorithm is: {minimum_cost}.")

    row_ind, col_ind = linear_sum_assignment(cost_matrix=cost_matrix,
                                             maximize=False)

    minimum_cost = cost_matrix[row_ind, col_ind].sum()

    print(
        f"The optimal assignment from Hungarian algorithm is: {list(zip(row_ind, col_ind))}."
    )
    print(f"The minimum cost from Hungarian algorithm is: {minimum_cost}.")
$ python hungarian.py 
The optimal assignment from brute force algorithm is: [(0, 0), (1, 1)].
The minimum cost from brute force algorithm is: 243.
The optimal assignment from Hungarian algorithm is: [(0, 0), (1, 1)].
The minimum cost from Hungarian algorithm is: 243.

Key Idea

If a number is added to or subtracted from all of the entries of any one row or column of a cost matrix $c$ (all the values in $c$ are non-negative), then an optimal assignment for the resulting cost matrix is also an optimal assignment for the original cost matrix.


Here is the way to understand this. The cost of each assignment consists of two parts, the cost from one of the entries in the selected row and the cost from the entries in the other rows. Suppose the original optimal (maximum or minimum) cost of the original optimal assignment is $c$, and $c > c^{\prime}$ if the optimal is maximum and $c < c^{\prime}$ if the optimal is minimum where $c^{\prime}$ is any of the costs of the non-optimal assignments. If a number $a$ is added to or subtracted from the selected row, the original optimal cost $c$ now becomes $c + a$ if it was addition, and $c - a$ if it was subtraction. In addition, any non-optimal cost $c^{\prime}$ now becomes $c^{\prime} + a$ if it was addition, and $c^{\prime} - a$ if it was subtraction. We still have $c + a > c^{\prime} + a$ or $c - a > c^{\prime} - a$ if the optimal is maximum and $c + a < c^{\prime} + a$ or $c - a < c^{\prime} - a$ if the optimal is minimum. This means a number is added to or subtracted from the selected row, the optimal assignment is still the optimal. Similarly, we could understand it for adding or subtracting a number from a selected column.

Intuitions

Given a $m \times m$ cost matrix $c$ and $m \times m$ assignment matrix $x$, we define $c_{ij}$ as the cost of assigning worker $i$ to task $j$, and $x_{ij} = 1$ if worker $i$ is assigned to task $j$, otherwise $x_{ij} = 0$. So the minimum cost assignment problem could be formulated as a linear programming problem mathematically as follows.

\[\DeclareMathOperator*{\argmin}{argmin} \begin{align} &\argmin_{x} \sum_{i=1}^{m} \sum_{j=1}^{m} c_{ij} x_{ij} \\ \end{align}\]

subject to a valid assignment $x$

\[\begin{align} &\sum_{i=1}^{m} x_{ij} = 1 \\ &\sum_{j=1}^{m} x_{ij} = 1 \\ &x_{ij} \in \{0, 1\} \\ \end{align}\]

Note that for maximum cost assignment problem, it could be formulated as the following minimization problem as well.

\[\DeclareMathOperator*{\argmin}{argmin} \begin{align} &\argmin_{x} \sum_{i=1}^{m} \sum_{j=1}^{m} -c_{ij} x_{ij} \\ \end{align}\]

we further define

\[\begin{align} z &\equiv \sum_{i=1}^{m} \sum_{j=1}^{m} c_{ij} x_{ij} \\ \end{align}\]

If somehow we have a series of magic values, $u = \{u_1, u_2, \cdots, u_m\}$, and $v = \{v_1, v_2, \cdots, v_m\}$, such that

\[\begin{align} c_{ij} - u_i - v_j \geq 0 \quad \forall i \in [1, m], j \in [1, m] \end{align}\]

and

\[\begin{align} c_{ij} - u_i - v_j = 0 \quad \forall (i, j) \in \text{a valid assignment $x^0$ such that } x_{ij}^0 = 1 \end{align}\]

We would like to plug in the values $u = \{u_1, u_2, \cdots, u_m\}$, and $v = \{v_1, v_2, \cdots, v_m\}$ to $z$. Note that the behavior of subtracting values from rows and columns in the Hungarian algorithm is encoded in this step.

\[\begin{align} z &= \sum_{i=1}^{m} \sum_{j=1}^{m} c_{ij} x_{ij} \\ &= \sum_{i=1}^{m} \sum_{j=1}^{m} (c_{ij} - u_i - v_j + u_i + v_j) x_{ij} \\ &= \sum_{i=1}^{m} \sum_{j=1}^{m} (c_{ij} - u_i - v_j) x_{ij} + \sum_{i=1}^{m} \sum_{j=1}^{m} u_i x_{ij} + \sum_{i=1}^{m} \sum_{j=1}^{m} v_j x_{ij} \\ &= \sum_{i=1}^{m} \sum_{j=1}^{m} (c_{ij} - u_i - v_j) x_{ij} + \sum_{i=1}^{m} u_i \sum_{j=1}^{m} x_{ij} + \sum_{j=1}^{m} v_j \sum_{i=1}^{m} x_{ij} \\ &= \sum_{i=1}^{m} \sum_{j=1}^{m} (c_{ij} - u_i - v_j) x_{ij} + \sum_{i=1}^{m} u_i \times 1 + \sum_{j=1}^{m} v_j \times 1 \\ &= \sum_{i=1}^{m} \sum_{j=1}^{m} (c_{ij} - u_i - v_j) x_{ij} + \sum_{i=1}^{m} u_i + \sum_{j=1}^{m} v_j \\ \end{align}\]

We see that different valid assignment will change the value for $\sum_{i=1}^{m} \sum_{j=1}^{m} (c_{ij} - u_i - v_j) x_{ij}$, but the value for $\sum_{i=1}^{m} u_i + \sum_{j=1}^{m} v_j$ remains unaffected. So minimizing $z$ is equivalent to minimizing $\sum_{i=1}^{m} \sum_{j=1}^{m} (c_{ij} - u_i - v_j) x_{ij}$.


Because

\[\begin{align} c_{ij} - u_i - v_j \geq 0 \quad \forall i \in [1, m], j \in [1, m] \end{align}\]

and

\[\begin{align} c_{ij} - u_i - v_j = 0 \quad \forall (i, j) \in \text{a valid assignment $x^0$ such that } x_{ij}^0 = 1 \end{align}\]

We have $\sum_{i=1}^{m} \sum_{j=1}^{m} (c_{ij} - u_i - v_j) x_{ij} \geq 0$.


If $x = x^0$, $\sum_{i=1}^{m} \sum_{j=1}^{m} (c_{ij} - u_i - v_j) x_{ij} = 0$, which is the minimum value it can achieve.


Therefore, we have found the optimal assignment $x = x^0$, and the minimum cost is $\sum_{i=1}^{m} u_i + \sum_{j=1}^{m} v_j$.


This partially explains why Hungarian algorithm works. However, it does not explain why the magic values exists and how the magic values are determined. The rationale behind this is related to the duality of a linear programming optimization problem. In our case, $x$ is the primal variables and $u$ and $v$ are dual variables. More details could be found from Katta G. Murty’s book “Network Programming” Chapter 3.1, which looks very complicated though. Frankly, I really need to find some time to go through this chapter.

Final Remarks

I rarely applaud for deep learning algorithms. Conventional algorithms such as the Hungarian matching algorithm are truly amazing.

References