P VS NP, NP Complete, NP Hard
Introduction
In computer science, the P VS NP is a famous unsolved theoretical computer science problem that has profound implications for many fields if a proof could be given. The P VS NP problem asks whether problems that can be verified in polynomial time can also be solved in polynomial time. The definitions of P, NP, the commonly seen NP-complete and NP-hard, as well as their relationships, could be confusing.
In this blog post, I would like to make a quick introduction to the P VS NP problem.
Deterministic VS Non-Deterministic Turing Machine
Turing machines are useful models of real computers and can be used for implementing any computer algorithm. The original P VS NP problem was proposed in the settings of deterministic and non-deterministic Turing machines.
Let’s check what deterministic and non-deterministic Turing machines are first.
Deterministic Turing Machine
In a deterministic Turing machine (DTM), the set of rules prescribes at most one action to be performed for any given situation. The input is said to be accepted by DTM if DTM ends up with an accept state.
Non-Deterministic Turing Machine
In contrast to a deterministic Turing machine, in a nondeterministic Turing machine (NTM) the set of rules may prescribe more than one action to be performed for any given situation. The input is said to be accepted by NTM if and only if at least one of the possible computational paths starting from that input puts the machine into an accept state.
Deterministic VS Non-Deterministic Turing Machine
The difference between DTM and NTM is that, the transition relation of DTM is a (one-to-one mapping) function rather than just a relation.
It’s also obvious that DTM is a special case of the NTM.
Simulating Non-Deterministic Turing Machine
It’s not difficult to imagine that NTM can be simulated using DTM. For example, if we simulate the NTM using a DTM that performs a breadth-first search on the NTM’s computation tree, until it finds a leaf of an accept state. Suppose the length of the shortest accept computation of the NTM is $d$, the computation complexity of simulating NTM using DTM breadth-first search is $O(b^d)$ where $b$ is the “branching factor” of the graph (the average out-degree).
Non-Deterministic Turing Machine for the Travelling Salesman Problem
There is a decision version of the travelling salesman problem. Given an input matrix of distances between $n$ cities, the problem is to determine if there is a route visiting all cities with total distance less than $k$.
Using a DTM to find the answer to this problem seems to be difficult and time consuming. But if someone proposed a route solution, a DTM can verify the solution to the problem in $O(n)$ time.
In contrast, a NTM can find such a route as follows:
- At each city it visits it will “guess” the next city to visit, until it has visited every vertex. If it gets stuck, it stops immediately.
- At the end it verifies that the route it has taken has cost less than $k$ in $O(n)$ time.
Thus, a NTM can solve the problem in $O(n)$ time.
Using a DTM breadth-first search to simulate the NTM in this case is no difference from using a DTM to check all the possible routes in a brute-force fashion, which takes $O(n \times n!)$ time.
In fact, because the DTM verification accept computation path is just one of the NTM solution accept computation paths, a DTM can verify the solution to the problem in $O(n)$ time if and only if a NTM can solve the problem in $O(n)$ time.
Now we come up with a question. Given a problem whose solution can be quickly verified by DTM, such as the decision version of the travelling salesman problem mentioned above, can it also be solved by DTM quickly? Even though I said “Using a DTM to find the answer to this problem seems to be difficult and time consuming”, I could not show any proof that a DTM that can solve such a problem quickly does not exist. This leads to the famous P VS NP problem.
P VS NP
Understanding the famous P VS NP problem is all about understanding the definitions. Let’s check the definitions of P, NP, NP-complete and NP-hard, and then understand the implications of P = NP and P ≠ NP.
P
In computational complexity theory, P is a fundamental complexity class that contains all decision problems that can be solved by a deterministic Turing machine using a polynomial amount of computation time, or polynomial time.
NP
In computational complexity theory, NP (nondeterministic polynomial time) is a complexity class used to classify decision problems. NP is the set of decision problems for which the problem instances, where the answer is “yes”, have proofs verifiable in polynomial time by a deterministic Turing machine, or alternatively the set of problems that can be solved in polynomial time by a nondeterministic Turing machine.
By definition, the decision version of the travelling salesman problem mentioned previously is a NP problem.
NP-Complete
In computational complexity theory, a problem in NP is NP-complete if every other problem in NP can be transformed (or reduced) into it in polynomial time. Because NP-complete problems are the “last steps” of solving all the NP problems, they are the hardest NP problems.
Because the problem transformation and the solution verification are separate steps, NP-complete problems verifies the solution in polynomial time, the transformation of other NP problems to the NP-complete the problems takes polynomial time, the sum of the time complexity for verifying the solution to a NP problem, if it were transformed to a NP-complete problem, is still polynomial time.
NP-Hard
In computational complexity theory, NP-hardness (non-deterministic polynomial-time hardness) is the defining property of a class of problems that are informally “at least as hard as the hardest problems in NP”. A more precise definition is: a problem H is NP-hard when every problem L in NP can be reduced in polynomial time to H.
The definition already suggests that NP-complete problems are NP-hard. But not every NP-hard problem is NP-complete, since the problem H is not restricted to be a NP problem. If the problem H takes asymptotically longer time to verify its solution using a DTM or solve using a NTM than a NP problem, H is thus not a NP problem, not to mention NP-complete. Therefore, we say NP-hard problems are “at least as hard as the hardest problems in NP”.
P VS NP Problem
The famous P VS NP problem asks whether problems that can be verified in polynomial time using a DTM can also be solved in polynomial time using a DTM. If the answer is yes, then we say P = NP. Otherwise, P ≠ NP. This problem has not been solved so far.
From the definitions we described above, there are some implications. For example, if any NP-complete problem can be solved in polynomial time by a DTM, then all the NP problems can be converted to this NP-complete problem in polynomial time and solved in polynomial time by a DTM, and consequently P = NP. There are some well-known NP-complete problems, such as the boolean satisfiability problem, the knapsack problem, and the decision version of the travelling salesman problem. However, as of today, none of the NP-complete problems we know so far can be solved in polynomial time using a DTM.
P VS NP VS NP-Complete VS NP-Hard Problem
Based on the definitions and implications of the P, NP, NP-Complete, and NP-hard, we could summarize the following Euler diagram for the scenarios of P = NP and P ≠ NP, respectively.
The left Euler diagram, which is widely believed, is easy to understand by the definitions.
The right Euler diagram is somewhat tricky to understand. What does P = NP ≃ NP-Complete mean? If P = NP, every NP-Complete problem becomes P. But assuming P = NP, is every problem in P or NP also NP-complete? The answer is no. There are NP problems which can be reduced to NP-complete problems while themselves are not NP-complete. It turns out that there are not so many such NP problem that can’t be reduced to NP-complete problems under P = NP. Only the trivial decision problems which always produce accept or reject, regardless of the inputs, are the NP problem that can’t be reduced to NP-complete problems under P = NP.
Proof
Let a problem B be in the class P, and B is the non-trivial problem that for some inputs the DTM will run into an accept state and for the some other inputs the DTM will run into a reject state. Without loss of generality, for a polynomial DTM that solves the problem B, let $w_{\text{in}}$ be the input that results in an accept state, and $w_{\text{out}}$ be the input that results in an reject state.
We would like to show that B is NP-complete.
Because we assumed P = NP, the problem B is also NP. Let A be an arbitrary problem from P or NP (A can be a trivial problem), and A can be solved by a DTM in polynomial time. We need to show that the problem A can be reduced to the problem B in polynomial time.
The polynomial time reduction function can just be:
Given an input $w$ to the problem A:
- Run the DTM to solve the problem A in polynomial time because A is in the class P.
- If the DTM results in an accept state, the reduction function outputs $w_{\text{in}}$ for the problem B which will also results in an accept state. If the DTM results in an reject state, the reduction function outputs $w_{\text{out}}$ for the problem B which will also results in a reject state.
This polynomial time reduction function seems weird because the mapping from the problem B to the problem A does not sound very logical. However, this polynomial time reduction function works.
The problem B cannot be trivial problems. Otherwise, non-trivial NP problems cannot be reduced to the problem B using the polynomial time reduction function mentioned above, because the trivial problem B either has no $w_{\text{in}}$ or no $w_{\text{out}}$.
Therefore, if P = NP, any problem in P or NP, except the trivial problems which are always be accepted by DTMs or always be rejected by DTMs, is NP-complete. We could conclude P = NP ≃ NP-Complete.
This concludes the proof. $\square$
References
P VS NP, NP Complete, NP Hard