Travelling Salesman Problem

Introduction

The classical travelling salesman problem (TSP) asks the following question: “Given a list of cities and the distances between each pair of cities, what is the shortest possible route that visits each city exactly once and returns to the origin city?” It is a classical NP-hard problem in combinatorial optimization, important in theoretical computer science and operations research.

It turns out that it could be formulated as a linear programming problem. In this blog post, I would like to implement a linear programming solver for the TSP problem.

Travelling Salesman Problem

Miller-Tucker-Zemlin (MTZ) Integer Linear Programming Formulation

There are two notable TSP integer linear programming formulations, Miller-Tucker-Zemlin (MTZ) formulation and the Dantzig-Fulkerson-Johnson (DFJ) formulation. We will be focusing on the MTZ formulation, as it is easier to program using the existing integer linear programming library.

Suppose we have $n$ cities, from $1$ to $n$. We define a binary square matrix $X \in \{0,1\}^{n \times n}$, where

$$
x_{ij} =
\begin{cases}
1 & \text{if the path goes from city $i$ to city $j$}\\
0 & \text{otherwise}\\
\end{cases} \\
$$

and a distance square matrix $C \in \mathbb{R}^{n \times n}$, where $c_{ij}$ is the distance from city $i$ to city $j$.

It is trivial to construct the integer linear programming optimization problem with the following constraints for the TSP problem.

$$
\DeclareMathOperator*{\argmin}{argmin}
\DeclareMathOperator*{\argmax}{argmax}
\argmin_{x_{ij}, \forall i,j \in [1,n]} \sum_{i=1}^{n} \sum_{j \neq i, j=1}^{n} c_{ij} x_{ij}
$$

subject to

$$
\begin{gather}
x_{ij} \in {0,1} \quad \forall i,j \in [1,n] \\
\sum_{i=1, i \neq j}^{n} x_{ij} = 1 \quad \forall j \in [1,n] \\
\sum_{j=1, j \neq i}^{n} x_{ij} = 1 \quad \forall i \in [1,n] \\
\end{gather}
$$

However, with these constraints, we will likely have subtours in our solution, which violates the TSP setting that the travelling salesman has to return to the original city. Subtours are just the incorrect TSP solution which contains several directed cycles. For example, suppose we have six cities, $1, 2, 3, 4, 5, 6$, without additional constraints, $x_{1,2}=1, x_{2,3}=1, x_{3,1}=1, x_{4,5}=1, x_{5,6}=1, x_{6,4}=1$ ($1 \rightarrow 2 \rightarrow 3 \rightarrow 1$, $4 \rightarrow 5 \rightarrow 6 \rightarrow 4$) are one of the valid solutions for optimization but they are incorrect for the TSP settings.

To prevent subtours, Miller, Tucker, and Zemlin introduces additional auxiliary integer variables $u_2, u_3, \cdots, u_n$ and additional constraints.

$$
\begin{gather}
u_i \in \mathbb{Z} \quad \forall i \in [2,n] \\
1 \leq u_i \leq n - 1 \quad 2 \leq i \leq n \\
u_i - u_j + n x_{ij} \leq n - 1 \quad 2 \leq i \neq j \leq n \\
\end{gather}
$$

With the constraints, if there are more than one directed cycles, at least one cycle will not contain node $1$ and the inequality of the $u$ variables will not be satisfied. For the previous incorrect TSP solution, we will have $u_4 < u_5 < u_6 < u_4$.

Also notice that linear function is both convex and concave (please show a simple proof on your own), which implies that every local optimum for the linear function is a global optimum. This means that the optimum solution from linear programming solver for the TSP problem is global optimum.

Install Dependencies

We will install and use pulp and glpk linear programming solvers to solve the TSP problems.

1
2
3
4
sudo apt-get update
sudo apt-get install glpk-utils

pip install pulp==2.4

TSP Integer Linear Programming Solver Python Implementation

We implemented a brute force TSP solver and an integer linear programming TSP solver, and verified the correctness of the two solvers on symmetric TSP problems. The two solvers could also be used for asymmetric TSP problems.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
from itertools import permutations
import random
import math
from timeit import default_timer as timer
from datetime import timedelta
from typing import List, Tuple, Union, Callable
import numpy as np
from pulp import LpMaximize, LpMinimize, LpProblem, LpStatus, lpSum, LpVariable, LpBinary, LpInteger, lpDot
from pulp import GLPK


def idx_2d_to_1d(i: int, j: int, width: int) -> int:

# i, j are in 1-indexed
# The returned index are 0 indexed.

return (i - 1) * width + (j - 1)


def assemble_tsp_solution(non_zero_idx: List[Tuple[int, int]]) -> List[int]:

# The TSP problem ensures that each city is visited exactly once.

# For example,
# [(2, 4), (3, 1), (4, 3), (1, 2)] -> [1, 2, 4, 3]
# Set the starting node to be 1 arbitrarily.

n = len(non_zero_idx)
tsp_solution = [1]

src_tgt_map = {}
for idx in non_zero_idx:
src_tgt_map[idx[0]] = idx[1]

src = 1
while len(tsp_solution) < n:
tgt = src_tgt_map[src]
tsp_solution.append(tgt)
src = tgt
if len(tsp_solution) == n:
assert src_tgt_map[src] == 1

return tsp_solution


def verify_symmetric_tsp_solution_equality(tsp_solution_1: List[int],
tsp_solution_2: List[int]) -> bool:

return (tsp_solution_1[:1] + tsp_solution_1[:0:-1]
) == tsp_solution_2 or tsp_solution_1 == tsp_solution_2


def tsp_brute_force_solver(
distance_matrix: np.ndarray) -> Tuple[List[int], float]:

assert distance_matrix.ndim == 2
assert distance_matrix.shape[0] == distance_matrix.shape[1]

n = distance_matrix.shape[0]

perms = permutations(range(2, n + 1))

tsp_solution = None
optimal_cost = float("inf")

for perm in perms:
perm = [1] + list(perm)
cost = 0
for i in range(n):
j = (i + 1) % n
cost += distance_matrix[perm[i] - 1][perm[j] - 1]
if cost < optimal_cost:
optimal_cost = cost
tsp_solution = perm

return tsp_solution, optimal_cost


def tsp_mtz_integer_linear_programming_solver(
distance_matrix: np.ndarray) -> Tuple[List[int], float]:

# TSP MTZ formulation
# https://en.wikipedia.org/wiki/Travelling_salesman_problem#Miller%E2%80%93Tucker%E2%80%93Zemlin_formulation

assert distance_matrix.ndim == 2
assert distance_matrix.shape[0] == distance_matrix.shape[1]

n = distance_matrix.shape[0]

distance_list = distance_matrix.flatten().tolist()

model = LpProblem(name="TSP-Problem", sense=LpMinimize)

x_variables = []
for i in range(n):
for j in range(n):
x_variables.append(LpVariable(name=f"x_{i+1}_{j+1}", cat=LpBinary))
# The first value in the u_variables is a dummy value and is never used.
# Depending on the literature, the lower bound might be 2, and the upper bound might be n. It is actually equivalent.
u_variables = [1] + [
LpVariable(name=f"u_{i+2}", lowBound=1, upBound=n - 1, cat=LpInteger)
for i in range(n - 1)
]

# Add objectives.
model += lpDot(x_variables, distance_list)

# Add constraints.
# sum_{i=1, i \neq j}^{n} x_{ij} = 1 for all j
for j in range(1, n + 1):
sum_exp = 0
for i in range(1, n + 1):
if i != j:
idx_id = idx_2d_to_1d(i=i, j=j, width=n)
sum_exp += x_variables[idx_id]
model += sum_exp == 1

# sum_{j=1, j \neq i}^{n} x_{ji} = 1 for all i
for i in range(1, n + 1):
sum_exp = 0
for j in range(1, n + 1):
if j != i:
idx_id = idx_2d_to_1d(i=i, j=j, width=n)
sum_exp += x_variables[idx_id]
model += sum_exp == 1

# u_i - u_j + n x_{ij} \neq n - 1 for all 2 <= i, j <= n and i \neq j
for i in range(2, n + 1):
for j in range(2, n + 1):
if i != j:
idx_id = idx_2d_to_1d(i=j, j=i, width=n)
model += u_variables[i - 1] - u_variables[
j - 1] + n * x_variables[idx_id] <= n - 1

status = model.solve(solver=GLPK(msg=False))

assert status == True, "Linear programming solver did not solve the problem successfully."

# Collect TSP solution
non_zero_idx = []
for variable in model.variables():
if variable.name.startswith("x_") and variable.value() == 1:
_, i, j = variable.name.split("_")
i = int(i)
j = int(j)
non_zero_idx.append((i, j))
assert len(non_zero_idx) == n

tsp_solution = assemble_tsp_solution(non_zero_idx=non_zero_idx)

optimal_cost = model.objective.value()

return tsp_solution, optimal_cost


def create_random_symmetric_distance_matrix(n: int) -> np.ndarray:
def euclidean_distance(node_1: Tuple[float, float],
node_2: Tuple[float, float]) -> float:

return math.sqrt((node_1[0] - node_2[0])**2 +
(node_1[1] - node_2[1])**2)

nodes = [(random.uniform(0, 1), random.uniform(0, 1)) for _ in range(n)]

distance_matrix = np.zeros((n, n))
for i in range(n):
for j in range(n):
distance = euclidean_distance(nodes[i], nodes[j])
distance_matrix[i][j] = distance
distance_matrix[j][i] = distance

return distance_matrix


def main():

random_seed = 0
n = 10

random.seed(random_seed)

# distance_matrix = np.array(
# [[ 0, 1, 1, 1.5],
# [ 1, 0, 1.5, 1],
# [ 1, 1.5, 0, 1],
# [1.5, 1, 1, 0]]
# )

distance_matrix = create_random_symmetric_distance_matrix(n=n)
print("-" * 75)
print("Distance Matrix (Symmetric):")
print(distance_matrix)

start = timer()
tsp_mtz_integer_linear_programming_solution, tsp_mtz_integer_linear_programming_optimal_cost = tsp_mtz_integer_linear_programming_solver(
distance_matrix=distance_matrix)
end = timer()
print("-" * 75)
print(
f"Integer Linear Programming Solver Time Elapsed: {timedelta(seconds=end-start)}"
)
print("Integer Linear Programming Solver Solution:")
print(tsp_mtz_integer_linear_programming_solution,
tsp_mtz_integer_linear_programming_optimal_cost)

start = timer()
tsp_brute_force_solution, tsp_brute_force_optimal_cost = tsp_brute_force_solver(
distance_matrix=distance_matrix)
end = timer()
print("-" * 75)
print(f"Brute Force Solver Time Elapsed: {timedelta(seconds=end-start)}")
print("Brute Force Solver Solution:")
print(tsp_brute_force_solution, tsp_brute_force_optimal_cost)

assert verify_symmetric_tsp_solution_equality(
tsp_solution_1=tsp_mtz_integer_linear_programming_solution,
tsp_solution_2=tsp_brute_force_solution)
assert np.isclose(tsp_mtz_integer_linear_programming_optimal_cost,
tsp_brute_force_optimal_cost)


if __name__ == "__main__":

main()

TSP Problem Solver Execution

Running the following program will print out the solution for the TSP problem in the terminal.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
$ python tsp.py 
---------------------------------------------------------------------------
Distance Matrix (Symmetric):
[[0. 0.65474242 0.48539707 0.45866571 0.40714944 0.26115321
0.56258812 0.55552086 0.23412879 0.14821244]
[0.65474242 0. 0.17189572 0.36593013 0.32926671 0.54598486
0.51589165 0.19797614 0.87365771 0.75205935]
[0.48539707 0.17189572 0. 0.29085421 0.18178614 0.4091835
0.41922677 0.18792852 0.70192003 0.58017765]
[0.45866571 0.36593013 0.29085421 0. 0.41570622 0.23665499
0.67580554 0.17365329 0.69104706 0.59943568]
[0.40714944 0.32926671 0.18178614 0.41570622 0. 0.43863303
0.26011635 0.36180871 0.58918709 0.4614387 ]
[0.26115321 0.54598486 0.4091835 0.23665499 0.43863303 0.
0.67474465 0.38543385 0.47810141 0.40935704]
[0.56258812 0.51589165 0.41922677 0.67580554 0.26011635 0.67474465
0. 0.6071072 0.66767467 0.54827597]
[0.55552086 0.19797614 0.18792852 0.17365329 0.36180871 0.38543385
0.6071072 0. 0.78812019 0.67931288]
[0.23412879 0.87365771 0.70192003 0.69104706 0.58918709 0.47810141
0.66767467 0.78812019 0. 0.12808409]
[0.14821244 0.75205935 0.58017765 0.59943568 0.4614387 0.40935704
0.54827597 0.67931288 0.12808409 0. ]]
---------------------------------------------------------------------------
Integer Linear Programming Solver Time Elapsed: 0:00:00.024176
Integer Linear Programming Solver Solution:
[1, 9, 10, 7, 5, 3, 2, 8, 4, 6] 2.3937246803810788
---------------------------------------------------------------------------
Brute Force Solver Time Elapsed: 0:00:02.765906
Brute Force Solver Solution:
[1, 6, 4, 8, 2, 3, 5, 7, 10, 9] 2.3937246803810788

Conclusion

Although the linear programming solver is much faster ($100\times$) than the brute force solver for the TSP problems in Python when the number of nodes is small, the linear programming solver will still hang as the number of nodes becomes larger ($\geq 23$) because the number of variables and constraints grow dramatically.

Reference

Author

Lei Mao

Posted on

07-20-2021

Updated on

07-20-2021

Licensed under


Comments