Background
Markov transition matrices
- scalar discrete random variables with $K$ categories
- $x_t, x_{t-1} \in 1, \dots, K$ (one-hot encoded)
- Transition matrix $[Q_t]{ij} = q(x_t = j|x{t-1} = i)$
- $q(x_t|x_{t-1}) = \text{Cat}(x_t; p=x_{t-1}Q_t)$
EX)
$$
x_{t-1} = [1, 0, 0],\quad Q_t = \begin{bmatrix} 0.8, 0.1, 0.1 \\ 0.1, 0.8, 0.1\\0.1,0.1,0.8\end{bmatrix}
$$
$$
x_{t-1}Q_t = [1, 0, 0]\begin{bmatrix} 0.8, 0.1, 0.1 \\ 0.1, 0.8, 0.1\\0.1,0.1,0.8\end{bmatrix} = [0.8, 0.1, 0.1]
$$
Doubly stochastic matrix
What we want:
- Rows of $\bm{Q}_t$ must sum to one: conserve probability mass
- Rows of $\bar{\bm{Q}_t} = \bm{Q}_1\bm{Q}_2\dots\bm{Q}_t$ must converge to a stationary distribution
—> chose $\bm{Q}_t$ as increasing powers of a doubly stochastic base matrix $\bm{Q}$
—> 실제로 implement할 때에는 base matrix에 power로 설정하지는 않고, 기존 방식대로 noise scale을 조금씩 늘려가는 방식을 이용하지만, 이 경우에도 크게 문제 없이 성립.
- $\sum_{i=1}^K \bm{Q}_{i,:} = \mathbf{1}$
- $\sum_{j=1}^K \bm{Q}_{:,j} = \mathbf{1}$