Background

PageRank is an algorithm developed by Google to rank web pages based on their importance in the hyperlink structure of the web. The core intuition is that a node's importance is determined recursively by the importance of the nodes linking to it. That is, a node is important if it is linked to by other important nodes.

Let the graph be represented by its transition matrix $M$, where $M_{i,j}$ is the probability of moving from node $j$ to node $i$. This is done by normalizing each column of the adjacency matrix $A$ so that the total probability of leaving a node sums to $1$. Specifically, if node $j$ has $d_j$ outgoing links (i.e., $d_j = \sum_{i} A_{i,j}$), we define:

$$ M_{i,j} = \begin{cases}\frac{1}{d_j} & \text{if } A_{i,j} = 1 \\0 & \text{otherwise}\end{cases}. $$

Intuitively, this means that from node $j$, a random walker chooses uniformly at random one of the outgoing links to follow. If node $j$ has 3 outgoing links, each target gets a probability of $\frac{1}{3}.$
The entry $M^\top_{i,j}$ gives the probability that a random walker currently at node $j$ came from node $i$ in the previous step.

The PageRank vector $\mathbf{r} \in \mathbb{R}^N$ is defined as the stationary distribution of this stochastic process, and it satisfies the fixed-point equation:

$$ \mathbf{r} = d M^\top \mathbf{r} + \frac{1 - d}{N} \mathbf{1} $$

Here:

$d \in (0,1)$ is the damping factor (typically set to 0.85),
$M$ is the transition matrix.
$N$ is the number of nodes in the graph,
$\mathbf{1}$ is the vector of all ones.

The solution is typically found by an iterative algorithm called power iteration:

$$ \mathbf{r}^{(t+1)} = d M^\top \mathbf{r}^{(t)} + \frac{1 - d}{N} \mathbf{1} $$

This process converges under mild conditions and provides a ranking over nodes that reflects their centrality in the graph structure.

Special Properties of DAGs

A Directed Acyclic Graph (DAG) has no cycles. Therefore, nodes can be arranged in a topological order $v_1, v_2, \dots, v_N$ such that every edge satisfies $i < j$. In this ordering, the adjacency matrix becomes strictly upper triangular, and the corresponding matrix $M$ is also upper triangular. Consequently, $M^\top$, which is used in PageRank iteration, becomes lower triangular.