Introduction
RSA (Rivest–Shamir–Adleman) algorithm is an asymmetric cryptographic algorithm that is widely used in the modern publickey cryptosystems. We have been hearing RSA algorithm all the time, but some of us actually did not know what it really is and how it works.
In this article, I will systematically discuss the theory behind the RSA algorithm. The theory guarantees that the cryptosystems built on the top of the RSA algorithm are relatively safe and hard to crack, which is fundamentally interesting.
Prerequisites
Euler’s Totient Function
In number theory, Euler’s totient function, also called Euler’s phi function, denoted as $\varphi(n)$, counts the positive integers up to a given integer n that are relatively prime to $n$. In other words, it is the number of integers $k$ in the range $1 \leq k \leq n$ for which the greatest common divisor $\gcd(n, k)$ is equal to 1.
Euler’s totient function is a multiplicative function, meaning that if two numbers $m$ and $n$ are relatively prime, then,
\[\varphi(mn) = \varphi(m)\varphi(n)\]If $k$ numbers, $\{m_1, m_2, \cdots, m_k\}$, are pairwise relatively prime, then
\[\varphi(\prod_{i=1}^{k}m_i) = \prod_{i=1}^{k} \varphi(m_i)\]A concrete proof of this property could be found here, which requires to use the Chinese remainder theorem.
When $n$ is prime number, according to the definition of prime, $\varphi(n) = n1$. If $m$ and $n$ are different prime numbers, because $m$ and $n$ are relatively prime, we have
\[\begin{aligned} \varphi(mn) &= \varphi(m)\varphi(n) \\ &= (m1)(n1) \end{aligned}\]Euler’s Theorem
If $m$ and $n$ are relatively prime, then,
\[m^{\varphi(n)} \equiv 1 \pmod n\]where $\varphi(n)$ is Euler’s totient function. This theorem is very famous, and there a couple of different proofs to it. One of the proofs could be found here.
Multiplicative Inverse Theorem
Let $n$ and $x$ be positive integers. Then $x$ has a multiplicative inverse modulo $n$ if and only if $\gcd(n, x) = 1$. Moreover, if it exists, then the multiplicative inverse is unique.
Equivalently, that is to say, let $n$ and $x$ be positive integers,
\[xy \equiv 1 \pmod n\]$y \bmod n$ exists if and only if $\gcd(n, x) = 1$, and $y \bmod n$ is unique.
Note that as long as the multiplicative inverse $y \bmod n$ exists, all the integers that has the same $y \bmod n$ satisfy $xy \equiv 1 \pmod n$. But there is only one such integer that is $0 \leq y \leq n1$. For instance, if there is a $y^\ast$ such that $xy^\ast \equiv 1 \pmod n$, then $xy^\ast  1 = kn$ for some integer $k$. Any other $y$ where $y=y^\ast + tn$, for any integer $t$, also satisfies $xy  1 = k^\prime n$ for some integer $k^\prime$ (This is easy to show). Therefore, we also have $xy \equiv 1 \pmod n$. Because there could be infinite number of $y$ which satisfies $xy \equiv 1 \pmod n$, we consider the multiplicative inverse to be $y \bmod n$ which is unique if it exists.
Proof
To prove for the sufficient condition and the uniqueness. There are $n$ possibilities of $y \pmod n$, $0, 1, 2, \cdots, n1$. Then the value of $xy$ could be $0x, 1x, 2x, \cdots, (n1)x$. We are going to show that $0x \bmod n, 1x \bmod n, 2x \bmod n, \cdots, (n1)x \bmod n$ are distinct if $\gcd(n, x) = 1$. Suppose there are two distinct integer $a, b$, $0 \leq a, b \leq n1$, and $ax \equiv bx \pmod n$. Then $(ab)x = kn$ for some integer $k$. Because $\gcd(n, x) = 1$, $ab = hn$ for some integer $h$. However, since $ab$ is in a range of $[n+1, n1]$ and $ab \neq 0$ because $a$ and $b$ are distinct, there is no integer $h$ which could satisfy $ab = hn$. Thus, $0x \bmod n, 1x \bmod n, 2x \bmod n, \cdots, (n1)x \bmod n$ have to be distinct if $\gcd(n, x) = 1$. Since there are $n$ possible values for $0x \bmod n, 1x \bmod n, 2x \bmod n, \cdots, (n1)x \bmod n$ which are distinct, there is only one of them which must be 1. Therefore, $y \bmod n$ exists if $\gcd(n, x) = 1$ and $y \bmod n$ is unique.
To prove for the necessary condition. Given nonnegative integer $y$ such that $xy \equiv 1 \pmod n$, we have $xy  1 = kn$ for some integer $k$. Suppose $\gcd(n, x) > 1$, we divide $\gcd(n, x)$ at both sides of the equation.
\[\begin{gather} \frac{xy  1}{\gcd(n, x)} = \frac{kn}{\gcd(n, x)} \\ \frac{xy}{\gcd(n, x)}  \frac{1}{\gcd(n, x)} = \frac{kn}{\gcd(n, x)} \end{gather}\]Because $\frac{xy}{\gcd(n, x)}$ and $\frac{kn}{\gcd(n, x)}$ are integers, but $\frac{1}{\gcd(n, x)}$ is not an integer because $\gcd(n, x) > 1$, there is no way for the equivalence. This raises a contradiction. Therefore, $\gcd(n, x) = 1$ if $y \bmod n$ exists.
This concludes the proof. $\square$
Lemma 1
If $m$ and $n$ are relatively prime, then,
\[m^{k\varphi(n)+1} \equiv m \pmod n\]where $\varphi(n)$ is Euler’s totient function, and $k$ could be any integer.
Proof
Using the compatibility with scaling in modular arithmetic properties, we multiply $m$ at both sides of $m^{\varphi(n)} \equiv 1 \pmod n$ from Euler’s theorem, we have
\[m^{\varphi(n)+1} \equiv m \pmod n\]We further multiply $m^{\varphi(n)}$ at the both sides of $m^{\varphi(n)+1} \equiv m \pmod n$, we have
\[m^{2\varphi(n)+1} \equiv m^{\varphi(n)+1} \pmod n\]By induction, we could show that
\[m^{k\varphi(n)+1} \equiv m \pmod n\]for any nonnegative integer $k$.
Similarly, we multiply $m^{\varphi(n)}$ at the both sides of $m^{\varphi(n)+1} \equiv m \pmod n$, we have
\[m \equiv m^{\varphi(n)+1} \pmod n\]By induction, we could show that
\[m \equiv m^{k\varphi(n)+1} \pmod n\]for any negative integer $k$.
This concludes the proof that the congruence is valid for any integer $k$. $\square$
RSA Algorithm
Basic Features of PublicKey Cryptosystems
The RSA algorithm is used as a typical publickey cryptosystem. Therefore, it has to match the four basic features for publickey cryptosystems. I am copying the descriptions for the features for the completeness of this article.
The encryption and decryption functions using the public key and private key in the RSA algorithm are denoted by $E$ and $D$, respectively. $M$ is used to represent the message to be encrypted and sent. The four basic features of a publickey cryptosystem, as well as the RSA algorithm, are:
 Decrypting an encrypted message gives you the original message.
 Encrypting a decrypted message gives you the original message.

$E$ and $D$ are easy to compute.

The publicity of $E$ does not compromise the secrecy of $D$.
RSA Basic Principle
A basic principle behind RSA is the observation that it is practical to find three very large positive integers $e$, $d$ and $n$ such that with modular exponentiation for all integers $m$ (with $0 \leq m < n$):
\[(m^e)^d \equiv m \pmod n\]Here, the tuple $(n, e)$ is usually called the public key for encryption, and the tuple $(n, d)$ is usually called the private key for decryption. $m$ is the message because you could always represent a message using an integer uniquely. If somehow the message is too long and $m$ exceeds $n$, we dissect the messages into trunks and encrypt separately.
RSA Key Generations
We would show how the $e$, $d$, and $n$ were generated in the RSA algorithm to satisfy the RSA basic principle.
In the RSA algorithm,
\[n = p \times q\]where $p$ and $q$ are some large distinct prime numbers.
Because of the Euler’s theorem in the prerequisite, if the message $m$ and $n$ are relatively prime, then
\[m^{\varphi(n)} \equiv 1 \pmod n\]where $\varphi(n)$ is Euler’s totient function.
There is an extremely rare case that when $m$ and $n$ are not relatively prime, that is only when $m = p$ or $m = q$, the decryption of the encrypted message would not recover the original content of the message. If we were that lucky, we would have cracked the RSA encryption system.
I am not sure people have set rules to eliminate this extremely rare corner case. If we really want to do so, in each encryption, we could encrypt the same message several times, say three times, using $E(m)$, $E(m+1)$, and $E(m+2)$. We also have the digital signature for each of the messages. If the messages were from the authentic author, there is no message content modification, and we did not hit $p$ and $q$ by chance, the three digital signatures should all pass the digital signature verifications. We would recover the three messages, $m$, $m+1$, and $m+2$, and further recover the three messages to the exact same message $m$, $m$ and $m$. However, if we somehow hit $p$ or $q$ by chance, some of the digital signatures would fail the verification. We just have to extract the message information from the messages that passed the digital signature verification. After all, the three messages contain the exact same information. According to the pigeonhole principle, the three distinct messages could not be all relatively prime to $p$ or $q$.
Based on the property of the Euler’s totient function in the prerequisite, computing the Euler’s totient function for the product of two distinct prime numbers is actually very easy.
\[\begin{aligned} \varphi(n) &= \varphi(pq) \\ &= \varphi(p)\varphi(q) \\ &= (p1)(q1) \end{aligned}\]Based on the Lemma 1 in the prerequisite, for any integer $k$,
\[m^{k\varphi(n)+1} \equiv m \pmod n\]We immediately found that, based on the RSA basic principle, $ed = k\varphi(n)+1$. Although $n$ is public, factorizing $n$ to $p$ and $q$ is almost impossible using an modern computer, computing $\varphi(n)$ using mathematical definition and the equation $\varphi(n) = (p1)(q1)$ are almost not possible neither. Therefore, releasing $e$ as the public key does not lead to the disclosure of $d$ easily.
Then the question becomes how to choose appropriate integers $e$ and $d$. It seems that $e$ and $d$ could be any values as long as the equivalence $ed = k\varphi(n)+1$ is satisfied for some integer $k$.
This is equivalent to say we need to satisfy
\[ed \equiv 1 \pmod {\varphi(n)}\]Based on the multiplicative inverse theorem in the prerequisite, as long as $\gcd(e,\varphi(n)) = 1$, there must be a unique $d$ which satisfies the above congruence. Getting such $e$ might not be hard. Although $e$ does not have to be prime, for our convenience, we could simply select a prime number from a corpus of prime numbers and verify whether $\gcd(e,\varphi(n)) = 1$ since verifying relatively primeness is easy if one of the numbers are known to be prime. A typical $e$ generally used could be 65537, which is a prime number.
Once $e$ is determined, $d \bmod \varphi(n)$ could be determined using the Extend Euclidean algorithm which takes $O((\log\varphi(n))^2)$ to run. Note that it is not necessary to make $d$ infinitely large large to make the private key $d$ less suspectable to cracking. Using any $d$s that have the same remainder $d \bmod \varphi(n)$ decrypt the encrypted message exactly the same.
With such $e$ and $d$, the RSA basic principle is satisfied.
Message Encryption and Decryption
With the orchestrated $e$ and $d$, and the RSA basic principle, it is not hard to find that the encryption function $E$ is
\[c \equiv m^e \pmod n\]where $c$ is the encrypted message. In practice,
\[c = m^e \bmod n\]The decryption function $D$ is
\[c^d \equiv (m^e)^d \equiv m \pmod n\]In the above congruences, the first congruence is due to the compatibility with exponentiation in modular arithmetic properties(https://en.wikipedia.org/wiki/Modular_arithmetic#Properties), the second congruence is because of the RSA basic principle. similarly, in practice,
\[m = c^d \bmod n\]Without any doubt, using such encryption and decryption, the first feature of the publickey cryptosystem, decrypting an encrypted message gives you the original message, is satisfied.
If we swap the positions of $e$ and $d$ in the RSA basic principle, surprisingly (or not?), the congruences and equivalences still hold, meaning that the second feature of the publickey cryptosystem, encrypting an decrypted message gives you the original message, also holds.
How about the third feature, $E$ and $D$ are easy to compute? $E$ and $D$ both involve exponential computations which would be extremely slow and memory consuming (actually no memory so far could fit a number with not very large exponents) using trivial algorithms if the exponents $e$ and $d$ are large. However, given $e$, it is meaningless for $d$ to be “infinitely” large to satisfy the congruence $ed \equiv 1 \pmod {\varphi(n)}$. Note that only $e$ and $d \bmod \varphi(n)$ are the actual keys. In addition, there are actually fast modular exponentiation algorithms which take $O(\log e)$ or $O(\log d)$ (i.e., $O(\log (d \bmod \varphi(n)))$) to run and are memory saving. We would not elaborate on it here, and would take it for granted that the third feature is satisfied.
The fourth feature of the publickey cryptosystems has also been satisfied. We have seen the related information in the earlier sections and will see more in the following sections.
Cracking the RSA Cryptosystem
Modern Computer
Cracking the RSA encryption system using brute force is not practically feasible. If the private key $d$ is large, it would take an extremely large number of iterations to guess the correct private key $d$, not even mention in each iteration, since you usually don’t know the message content, there is hardly any way to verify whether the decrypted message using the guessed private key $d$ is the original message that you have never seen.
A better way is to factorize the public $n$. If somehow you know the value of $\varphi(n)$, with the public key $e$, you would derive the value of $d \bmod \varphi(n)$. Remember what is important in the RSA algorithm is $d \bmod \varphi(n)$ instead of the actual value of $d$.
Quantum Computer
The ordinary algorithm to do integer factorization takes subexponential time according to the Wikipedia. This is the fundamental reason which makes the RSA cryptosystem so reliable.
Quantum computer, however, is good at integer factorization. Using Shor’s algorithm, quantum computer does integer factorization in polynomial time, which makes cracking the RSA cryptosystem possible.