authored by [email protected] (prev. @Baidu Silicon Valley AI Lab), Oct 23, 2017
Inference in a nutshell:
Computing posterior distribution
$$ p(\vec{h}|\vec{v}) $$
where $\vec{v}$ is the observed data and $\vec{h}$ is latent.
Here, we (mostly) treat inference as optimization: augment p with a distribution q over the latent h, and maximize a lower bound on the marginal likelihood.
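A minimal sketch of the "inference as optimization" idea, on a toy conjugate model where the exact posterior is known (the model, variational family, and step sizes here are illustrative choices, not from any of the references below): take $p(h)=\mathcal{N}(0,1)$, $p(v|h)=\mathcal{N}(h,1)$, so the exact posterior is $\mathcal{N}(v/2,\,1/2)$, and fit $q(h)=\mathcal{N}(\mu, s^2)$ by gradient ascent on the ELBO.

```python
import math

# Toy model: p(h) = N(0, 1), p(v | h) = N(h, 1); we observe v.
# Exact posterior: p(h | v) = N(v/2, 1/2).
# Variational family: q(h) = N(mu, s^2). Up to constants, the ELBO is
#   L(mu, s) = E_q[log p(v|h)] + E_q[log p(h)] + H[q]
#            = -0.5*((v-mu)^2 + s^2) - 0.5*(mu^2 + s^2) + log s + const,
# using standard Gaussian expectations, so the gradients are closed-form.

def fit_q(v, lr=0.05, steps=500):
    """Gradient ascent on the ELBO for this toy model (hypothetical helper)."""
    mu, s = 0.0, 1.0
    for _ in range(steps):
        grad_mu = v - 2.0 * mu       # d/dmu: (v - mu) - mu
        grad_s = 1.0 / s - 2.0 * s   # d/ds: -2s + 1/s
        mu += lr * grad_mu
        s += lr * grad_s
    return mu, s

mu, s = fit_q(v=2.0)
print(mu, s)  # converges to mu ~ v/2 = 1.0, s ~ sqrt(1/2) ~ 0.707
```

Because the family here contains the true posterior, the ELBO is tight at the optimum; in general q only approximates $p(\vec{h}|\vec{v})$ and the gap is the KL divergence.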
Book Chapter:
Variational Inference and Learning
Marginal Likelihood (the quantity we are lower-bounding at the very beginning)
Marginal likelihood - Wikipedia
E-M intro (behind paywall :( )
What is the expectation maximization algorithm?
Auto-encoding variational bayes
https://www.youtube.com/watch?v=rjZL7aguLAs
(unevaluated, but interesting) Adversarially learned inference
[1606.00704] Adversarially Learned Inference
New paper arguing that SGD implicitly performs variational inference