# Travelling Salesman Problem

## Introduction

The classical travelling salesman problem (TSP) asks the following question: “Given a list of cities and the distances between each pair of cities, what is the shortest possible route that visits each city exactly once and returns to the origin city?” It is a classical NP-hard problem in combinatorial optimization, important in theoretical computer science and operations research.

It turns out that it could be formulated as a linear programming problem. In this blog post, I would like to implement a linear programming solver for the TSP problem.

## Travelling Salesman Problem

### Miller-Tucker-Zemlin (MTZ) Integer Linear Programming Formulation

There are two notable TSP integer linear programming formulations, Miller-Tucker-Zemlin (MTZ) formulation and the Dantzig-Fulkerson-Johnson (DFJ) formulation. We will be focusing on the MTZ formulation, as it is easier to program using the existing integer linear programming library.

Suppose we have $n$ cities, from $1$ to $n$. We define a binary square matrix $X \in \{0,1\}^{n \times n}$, where

$$x_{ij} = \begin{cases} 1 & \text{if the path goes from city i to city j}\\ 0 & \text{otherwise}\\ \end{cases} \\$$

and a distance square matrix $C \in \mathbb{R}^{n \times n}$, where $c_{ij}$ is the distance from city $i$ to city $j$.

It is trivial to construct the integer linear programming optimization problem with the following constraints for the TSP problem.

$$\DeclareMathOperator*{\argmin}{argmin} \DeclareMathOperator*{\argmax}{argmax} \argmin_{x_{ij}, \forall i,j \in [1,n]} \sum_{i=1}^{n} \sum_{j \neq i, j=1}^{n} c_{ij} x_{ij}$$

subject to

$$\begin{gather} x_{ij} \in {0,1} \quad \forall i,j \in [1,n] \\ \sum_{i=1, i \neq j}^{n} x_{ij} = 1 \quad \forall j \in [1,n] \\ \sum_{j=1, j \neq i}^{n} x_{ij} = 1 \quad \forall i \in [1,n] \\ \end{gather}$$

However, with these constraints, we will likely have subtours in our solution, which violates the TSP setting that the travelling salesman has to return to the original city. Subtours are just the incorrect TSP solution which contains several directed cycles. For example, suppose we have six cities, $1, 2, 3, 4, 5, 6$, without additional constraints, $x_{1,2}=1, x_{2,3}=1, x_{3,1}=1, x_{4,5}=1, x_{5,6}=1, x_{6,4}=1$ ($1 \rightarrow 2 \rightarrow 3 \rightarrow 1$, $4 \rightarrow 5 \rightarrow 6 \rightarrow 4$) are one of the valid solutions for optimization but they are incorrect for the TSP settings.

To prevent subtours, Miller, Tucker, and Zemlin introduces additional auxiliary integer variables $u_2, u_3, \cdots, u_n$ and additional constraints.

$$\begin{gather} u_i \in \mathbb{Z} \quad \forall i \in [2,n] \\ 1 \leq u_i \leq n - 1 \quad 2 \leq i \leq n \\ u_i - u_j + n x_{ij} \leq n - 1 \quad 2 \leq i \neq j \leq n \\ \end{gather}$$

With the constraints, if there are more than one directed cycles, at least one cycle will not contain node $1$ and the inequality of the $u$ variables will not be satisfied. For the previous incorrect TSP solution, we will have $u_4 < u_5 < u_6 < u_4$.

Also notice that linear function is both convex and concave (please show a simple proof on your own), which implies that every local optimum for the linear function is a global optimum. This means that the optimum solution from linear programming solver for the TSP problem is global optimum.

### Install Dependencies

We will install and use pulp and glpk linear programming solvers to solve the TSP problems.

### TSP Integer Linear Programming Solver Python Implementation

We implemented a brute force TSP solver and an integer linear programming TSP solver, and verified the correctness of the two solvers on symmetric TSP problems. The two solvers could also be used for asymmetric TSP problems.

### TSP Problem Solver Execution

Running the following program will print out the solution for the TSP problem in the terminal.

## Conclusion

Although the linear programming solver is much faster ($100\times$) than the brute force solver for the TSP problems in Python when the number of nodes is small, the linear programming solver will still haunt as the number of nodes becomes larger ($\geq 23$) because the number of variables and constraints grow dramatically.

Lei Mao

07-20-2021

07-20-2021