 ### Lei Mao

Machine Learning, Artificial Intelligence, Computer Science.

# Travelling Salesman Problem

### Introduction

The classical travelling salesman problem (TSP) asks the following question: “Given a list of cities and the distances between each pair of cities, what is the shortest possible route that visits each city exactly once and returns to the origin city?” It is a classical NP-hard problem in combinatorial optimization, important in theoretical computer science and operations research.

It turns out that it could be formulated as a linear programming problem. In this blog post, I would like to implement a linear programming solver for the TSP problem.

### Travelling Salesman Problem

#### Miller-Tucker-Zemlin (MTZ) Integer Linear Programming Formulation

There are two notable TSP integer linear programming formulations, Miller-Tucker-Zemlin (MTZ) formulation and the Dantzig-Fulkerson-Johnson (DFJ) formulation. We will be focusing on the MTZ formulation, as it is easier to program using the existing integer linear programming library.

Suppose we have $n$ cities, from $1$ to $n$. We define a binary square matrix $X \in \{0,1\}^{n \times n}$, where

$x_{ij} = \begin{cases} 1 & \text{if the path goes from city i to city j}\\ 0 & \text{otherwise}\\ \end{cases} \\$

and a distance square matrix $C \in \mathbb{R}^{n \times n}$, where $c_{ij}$ is the distance from city $i$ to city $j$.

It is trivial to construct the integer linear programming optimization problem with the following constraints for the TSP problem.

$\DeclareMathOperator*{\argmin}{argmin} \DeclareMathOperator*{\argmax}{argmax} \argmin_{x_{ij}, \forall i,j \in [1,n]} \sum_{i=1}^{n} \sum_{j \neq i, j=1}^{n} c_{ij} x_{ij}$

subject to

$\begin{gather} x_{ij} \in \{0,1\} \quad \forall i,j \in [1,n] \\ \sum_{i=1, i \neq j}^{n} x_{ij} = 1 \quad \forall j \in [1,n] \\ \sum_{j=1, j \neq i}^{n} x_{ij} = 1 \quad \forall i \in [1,n] \\ \end{gather}$

However, with these constraints, we will likely have subtours in our solution, which violates the TSP setting that the travelling salesman has to return to the original city. Subtours are just the incorrect TSP solution which contains several directed cycles. For example, suppose we have six cities, $1, 2, 3, 4, 5, 6$, without additional constraints, $x_{1,2}=1, x_{2,3}=1, x_{3,1}=1, x_{4,5}=1, x_{5,6}=1, x_{6,4}=1$ ($1 \rightarrow 2 \rightarrow 3 \rightarrow 1$, $4 \rightarrow 5 \rightarrow 6 \rightarrow 4$) are one of the valid solutions for optimization but they are incorrect for the TSP settings.

To prevent subtours, Miller, Tucker, and Zemlin introduces additional auxiliary integer variables $u_2, u_3, \cdots, u_n$ and additional constraints.

$\begin{gather} u_i \in \mathbb{Z} \quad \forall i \in [2,n] \\ 1 \leq u_i \leq n - 1 \quad 2 \leq i \leq n \\ u_i - u_j + n x_{ij} \leq n - 1 \quad 2 \leq i \neq j \leq n \\ \end{gather}$

With the constraints, if there are more than one directed cycles, at least one cycle will not contain node $1$ and the inequality of the $u$ variables will not be satisfied. For the previous incorrect TSP solution, we will have $u_4 < u_5 < u_6 < u_4$.

Also notice that linear function is both convex and concave (please show a simple proof on your own), which implies that every local optimum for the linear function is a global optimum. This means that the optimum solution from linear programming solver for the TSP problem is global optimum.

#### Install Dependencies

We will install and use pulp and glpk linear programming solvers to solve the TSP problems.

sudo apt-get update
sudo apt-get install glpk-utils

pip install pulp==2.4


#### TSP Integer Linear Programming Solver Python Implementation

We implemented a brute force TSP solver and an integer linear programming TSP solver, and verified the correctness of the two solvers on symmetric TSP problems. The two solvers could also be used for asymmetric TSP problems.

from itertools import permutations
import random
import math
from timeit import default_timer as timer
from datetime import timedelta
from typing import List, Tuple, Union, Callable
import numpy as np
from pulp import LpMaximize, LpMinimize, LpProblem, LpStatus, lpSum, LpVariable, LpBinary, LpInteger, lpDot
from pulp import GLPK

def idx_2d_to_1d(i: int, j: int, width: int) -> int:

# i, j are in 1-indexed
# The returned index are 0 indexed.

return (i - 1) * width + (j - 1)

def assemble_tsp_solution(non_zero_idx: List[Tuple[int, int]]) -> List[int]:

# The TSP problem ensures that each city is visited exactly once.

# For example,
# [(2, 4), (3, 1), (4, 3), (1, 2)] -> [1, 2, 4, 3]
# Set the starting node to be 1 arbitrarily.

n = len(non_zero_idx)
tsp_solution = 

src_tgt_map = {}
for idx in non_zero_idx:
src_tgt_map[idx] = idx

src = 1
while len(tsp_solution) < n:
tgt = src_tgt_map[src]
tsp_solution.append(tgt)
src = tgt
if len(tsp_solution) == n:
assert src_tgt_map[src] == 1

return tsp_solution

def verify_symmetric_tsp_solution_equality(tsp_solution_1: List[int],
tsp_solution_2: List[int]) -> bool:

return (tsp_solution_1[:1] + tsp_solution_1[:0:-1]
) == tsp_solution_2 or tsp_solution_1 == tsp_solution_2

def tsp_brute_force_solver(
distance_matrix: np.ndarray) -> Tuple[List[int], float]:

assert distance_matrix.ndim == 2
assert distance_matrix.shape == distance_matrix.shape

n = distance_matrix.shape

perms = permutations(range(2, n + 1))

tsp_solution = None
optimal_cost = float("inf")

for perm in perms:
perm =  + list(perm)
cost = 0
for i in range(n):
j = (i + 1) % n
cost += distance_matrix[perm[i] - 1][perm[j] - 1]
if cost < optimal_cost:
optimal_cost = cost
tsp_solution = perm

return tsp_solution, optimal_cost

def tsp_mtz_integer_linear_programming_solver(
distance_matrix: np.ndarray) -> Tuple[List[int], float]:

# TSP MTZ formulation
# https://en.wikipedia.org/wiki/Travelling_salesman_problem#Miller%E2%80%93Tucker%E2%80%93Zemlin_formulation

assert distance_matrix.ndim == 2
assert distance_matrix.shape == distance_matrix.shape

n = distance_matrix.shape

distance_list = distance_matrix.flatten().tolist()

model = LpProblem(name="TSP-Problem", sense=LpMinimize)

x_variables = []
for i in range(n):
for j in range(n):
x_variables.append(LpVariable(name=f"x_{i+1}_{j+1}", cat=LpBinary))
# The first value in the u_variables is a dummy value and is never used.
# Depending on the literature, the lower bound might be 2, and the upper bound might be n. It is actually equivalent.
u_variables =  + [
LpVariable(name=f"u_{i+2}", lowBound=1, upBound=n - 1, cat=LpInteger)
for i in range(n - 1)
]

model += lpDot(x_variables, distance_list)

# sum_{i=1, i \neq j}^{n} x_{ij} = 1 for all j
for j in range(1, n + 1):
sum_exp = 0
for i in range(1, n + 1):
if i != j:
idx_id = idx_2d_to_1d(i=i, j=j, width=n)
sum_exp += x_variables[idx_id]
model += sum_exp == 1

# sum_{j=1, j \neq i}^{n} x_{ji} = 1 for all i
for i in range(1, n + 1):
sum_exp = 0
for j in range(1, n + 1):
if j != i:
idx_id = idx_2d_to_1d(i=i, j=j, width=n)
sum_exp += x_variables[idx_id]
model += sum_exp == 1

# u_i - u_j + n x_{ij} \neq n - 1 for all 2 <= i, j <= n and i \neq j
for i in range(2, n + 1):
for j in range(2, n + 1):
if i != j:
idx_id = idx_2d_to_1d(i=j, j=i, width=n)
model += u_variables[i - 1] - u_variables[
j - 1] + n * x_variables[idx_id] <= n - 1

# print(model)
status = model.solve(solver=GLPK(msg=False))

assert status == True, "Linear programming solver did not solve the problem successfully."

# Collect TSP solution
non_zero_idx = []
for variable in model.variables():
if variable.name.startswith("x_") and variable.value() == 1:
_, i, j = variable.name.split("_")
i = int(i)
j = int(j)
non_zero_idx.append((i, j))
assert len(non_zero_idx) == n

tsp_solution = assemble_tsp_solution(non_zero_idx=non_zero_idx)

optimal_cost = model.objective.value()

return tsp_solution, optimal_cost

def create_random_symmetric_distance_matrix(n: int) -> np.ndarray:
def euclidean_distance(node_1: Tuple[float, float],
node_2: Tuple[float, float]) -> float:

return math.sqrt((node_1 - node_2)**2 +
(node_1 - node_2)**2)

nodes = [(random.uniform(0, 1), random.uniform(0, 1)) for _ in range(n)]

distance_matrix = np.zeros((n, n))
for i in range(n):
for j in range(n):
distance = euclidean_distance(nodes[i], nodes[j])
distance_matrix[i][j] = distance
distance_matrix[j][i] = distance

return distance_matrix

def main():

random_seed = 0
n = 10

random.seed(random_seed)

# distance_matrix = np.array(
#     [[  0,   1, 1, 1.5],
#     [  1,   0, 1.5, 1],
#     [  1, 1.5,   0, 1],
#     [1.5,   1,   1, 0]]
# )

distance_matrix = create_random_symmetric_distance_matrix(n=n)
print("-" * 75)
print("Distance Matrix (Symmetric):")
print(distance_matrix)

start = timer()
tsp_mtz_integer_linear_programming_solution, tsp_mtz_integer_linear_programming_optimal_cost = tsp_mtz_integer_linear_programming_solver(
distance_matrix=distance_matrix)
end = timer()
print("-" * 75)
print(
f"Integer Linear Programming Solver Time Elapsed: {timedelta(seconds=end-start)}"
)
print("Integer Linear Programming Solver Solution:")
print(tsp_mtz_integer_linear_programming_solution,
tsp_mtz_integer_linear_programming_optimal_cost)

start = timer()
tsp_brute_force_solution, tsp_brute_force_optimal_cost = tsp_brute_force_solver(
distance_matrix=distance_matrix)
end = timer()
print("-" * 75)
print(f"Brute Force Solver Time Elapsed: {timedelta(seconds=end-start)}")
print("Brute Force Solver Solution:")
print(tsp_brute_force_solution, tsp_brute_force_optimal_cost)

assert verify_symmetric_tsp_solution_equality(
tsp_solution_1=tsp_mtz_integer_linear_programming_solution,
tsp_solution_2=tsp_brute_force_solution)
assert np.isclose(tsp_mtz_integer_linear_programming_optimal_cost,
tsp_brute_force_optimal_cost)

if __name__ == "__main__":

main()



#### TSP Problem Solver Execution

Running the following program will print out the solution for the TSP problem in the terminal.