What?

A way to add additional edges to a graph based on the downstream task performance as well as a bunch of interesting connections between different areas of research (Finite State Automata, POMDPs and Successor Representations).

Why?

It's not straightforward, how to construct a graph from some relational data (Yes! And sometimes, the default approach can be suboptimal). Automation is good, let's automate this. Pun intended!

How?

source: original paper

source: original paper

Main idea:

<aside> 💡 We would like our agent to be powerful enough to extract useful information from this POMDP, but simple enough that we can efficiently compute and differentiate through the learned trajectories.

</aside>

Since the authors care about program analysis and regular languages, they use finite-state automata.

To get the adjacency matrix, we need to compute the probabilities of each of the final action $a_T$ in final node $n_T$:

$$ p(a_T, n_T|n_0, \pi_\theta) = \big[\sum_{i\geq0}{HQ^i_{n_0}\delta_{n_0}}\big]{(a_T, n_T)} = H{(a_T, n_T), :}(I-Q_{n_0})^{-1}\delta_{n_0}, $$