A way to add additional edges to a graph based on the downstream task performance as well as a bunch of interesting connections between different areas of research (Finite State Automata, POMDPs and Successor Representations).
It's not straightforward, how to construct a graph from some relational data (Yes! And sometimes, the default approach can be suboptimal). Automation is good, let's automate this. Pun intended!
source: original paper
Main idea:
<aside> 💡 We would like our agent to be powerful enough to extract useful information from this POMDP, but simple enough that we can efficiently compute and differentiate through the learned trajectories.
</aside>
Since the authors care about program analysis and regular languages, they use finite-state automata.
To get the adjacency matrix, we need to compute the probabilities of each of the final action $a_T$ in final node $n_T$:
$$ p(a_T, n_T|n_0, \pi_\theta) = \big[\sum_{i\geq0}{HQ^i_{n_0}\delta_{n_0}}\big]{(a_T, n_T)} = H{(a_T, n_T), :}(I-Q_{n_0})^{-1}\delta_{n_0}, $$