BASIC DEFINITIONS
$Integers \ mod \ N$ is the set of integers $\{0, 1, \dots, N-1\}$ denoted as $\mathbb{Z}_N$.
Two integers $s$ & $t$ are $Equivalent \ mod \ N$ if $N$ divides both $s$ & $t$ with the same remainder, or in other words:
$s \ mod \ N = t \ mod \ N$
the above representation, can then be written in short hand as:
$s = t\ (mod \ N)$
note that the function presented later in brackets now applies to the whole equation
Any integer $k$ can be $Reduced \ mod \ N$ by taking the Remainder $r$ after dividing $k$ by $N$. This is represented as:
$r = k \ mod \ N$
Further,
$xy \ mod N = (x \ mod \ N)(y \ mod \ N)\ mod \ N$
Inverse of $x \ mod \ N$:
For integers $x,y \in \mathbb{Z}_N$, if there exists a pair, such that:
$xy = yx = 1 \ (mod\ N)$
Then, y is unique for a given $x$ and is called the Inverse of $x \ mod \ N$.
Given, two positive integers $x$ & $N$, with no common factors such that $x < N$. The ORDER of $x$ is the smallest positive integer $r$, such that:
$\boxed{x^r =1 \ (mod \ N)}$
Iff $p$ is prime, then $\mathbb{Z}_p = \{0, 1, \dots, p-1\}$ along with the operations $addition$, $mod \ p$ & $multiplication \ mod \ p$ is a finite field ($F_p$) / Galois field $(GF(p))$.
For a given integer $N$, $\mathbb{Z}_N^*$ denotes the set of numbers $x$ in $\mathbb{Z}_N$, where $x$ & $N$ are coprime, i.e.
$gcd(x, N) = 1$
THE QUANTUM ALGORITHM
Classical computers fail:
The Algorithm for order-finding is the Phase Estimation Algorithm applied on the unitary operator $U$, such that:
$U\ket{y} = \left \{ \begin{matrix}\ket{xy\ mod\ N} & 0 \le y \le N-1 \\ \ket{y} & N \le y \le 2^L-1\end{matrix}\right .$
where, $y \in \{0,1\}^{L}$ (here, $L = log(N)$). Or in simple words, all binary strings of length $L$.
also, $GCD(x, N) = 1$.
It’s a fact that, the transformation performed by the $U$ operator is reversible & hence $U$ is unitary.
Now, we have the order equation:
$x^r = 1 \ (mod \ N)$
so, we can ideally write:
$U\ket{y} = \ket{x y \ mod \ N}$
$U\ket{xy \ mod \ N} = \ket{x^2 y \ mod \ N}$
$\dots$
$U^r\ket{y} = \ket{x^ry \ mod \ N} = \ket{(x^r \ mod \ N)(y \ mod \ N)\ mod \ N}$
$= \ket{1(y \ mod \ N)\ mod \ N}$
$= \ket{y \ mod \ N}$
$=\ket{y}$ // if $y \lt N$
$\therefore U$ is an $r^{th}$ root of Identity Operation.
Further, the eigenstates(eigenvectors) $\{\ket{u_s}\}_{s = 0 : r-1}$ of $U$ are given by:
$\boxed{\ket{u_s} =\frac{1}{\sqrt r}\sum_{k=0}^{r-1}e^{\frac{-2\pi i s k}{r}}\ket{x^k \ mod\ N}}$
Applying $U$ on these states, we get:
$\boxed{U\ket{u_s} =\frac{1}{\sqrt r}\sum_{k=0}^{r-1}e^{\frac{-2\pi i s k}{r}}\ket{x^{k+1} \ mod\ N} = e^{\frac{2\pi i s}{r}}\ket{u_s}}$
Notice, that this now looks very similar to the Phase Estimation problem given by:
$U\ket{\psi} = e^{2\pi i\theta}\ket{\psi}$, where $\theta = \large \frac{s}{r}$ (for order finding problem)
Thus, we can now apply the phase estimation circuit on the following equation (for any $s \in \{0,1,\dots,r-1\}$):
$PE : \ket{0}\ket{u_s} \mapsto \ket{\tilde{\large\frac{s}{r}}}\ket{u_s}$

(approximate Phase is obtained, thus $r$ can be easily obtained)
Anomally : In order for the Phase Estimation circuit to work, we must give in the input $\ket{u_s}$, which can’t be found out without the value $r$ (take a look at the defintion earlier presented in the text)!
Resolution: Instead of preparing an individual eigenstate $\ket{u_s}$, we instead find it sufficient to make SUPERPOSITION that has each eigen state with a reasonable weight. Such a superposition can be prepared without the knowledge of $r$.
Superposition of the eigenstates:
Consider:
$\large\frac{1}{\sqrt{r}} \normalsize\sum_{s=0}^{r-1}\ket{u_s} = \large\frac{1}{\sqrt{r}} \normalsize\sum_{s=0}^{r-1}\frac{1}{\sqrt r}\sum_{k=0}^{r-1}e^{\frac{-2\pi i s k}{r}}\ket{x^k \ mod\ N}$
let’s, for the moment look only at the terms where $k=0$:
then, $\ket{x^k \ mod \ N} = \ket{1}$, as $k = 0\ (mod \ r)$.
Hence, giving us the following equation for $k=0$:
$\boxed{\large\frac{1}{\sqrt{r}} \normalsize\sum_{s=0}^{r-1}\frac{1}{\sqrt r}e^{\frac{-2\pi i s 0}{r}}\ket{1} = \large\frac{1}{r}\normalsize\sum_{s=0}^{r-1}\ket{1} = \large\frac{r}{r}\normalsize\ket{1} = \ket{1}}$
We have arrived at an important result! From the above equation:
The probability of measuring the state $\ket{1}$ is $|1|^2 = 1$!
For rest of the terms i.e. $k \ne 0$, the probability will thus be
$1-1 = 0$
This simply reduces the superposition equation to:
$\large\frac{1}{\sqrt{r}} \normalsize\sum_{s=0}^{r-1}\ket{u_s} = \ket{1}$
Applying Phase Estimation
We can now, instead of writing:
$PE : \ket{0}\ket{u_s} \mapsto \ket{\tilde{\large\frac{s}{r}}}\ket{u_s}$
Write the following using a superposition of the eigenvectors:
$PE : \ket{0}\otimes\ket{1} = \large\frac{1}{\sqrt{r}} \normalsize\sum_{s=0}^{r-1}\ket{0} \ket{u_s} \mapsto \large\frac{1}{\sqrt{r}} \sum_{s=0}^{r-1}\ket{\tilde{\large\frac{s}{r}}}\ket{u_s}$