Summaries, Math, and Definitions

Introduction


*I strongly recommend reading from slide 21 onwards, which covers the sampling methods.

The beauty of the variational approach is that we do not need to specify a specific parametric form for q. We specify how it should factorize, but then the optimization problem determines the optimal probability distribution within those factorization constraints. For discrete latent variables, this just means that we use traditional optimization techniques to optimize a finite number of variables describing the q distribution. For continuous latent variables, this means that we use a branch of mathematics called calculus of variations to perform optimization over a space of functions and actually determine which function should be used to represent q.
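To make the discrete case concrete, here is a minimal sketch of mean-field variational inference with two binary latent variables. Because the latents are discrete and few, the expectation in the evidence lower bound is an exact finite sum over latent configurations, and coordinate ascent over the factorized q parameters is ordinary finite-dimensional optimization. The toy model itself (the weights w, noise scale sigma, and prior probability) is invented purely for illustration and is not from the slides.

```python
import itertools
import numpy as np

# Hypothetical toy model (not from the talk): two binary latent units
# h = (h1, h2) generate a scalar observation v through
# p(v | h) = N(v; w . h, sigma^2), with Bernoulli(prior_p) priors on each h_i.
w = np.array([1.0, -2.0])
sigma = 0.5
prior_p = 0.3

def log_joint(v, h):
    """log p(v, h) = log p(v | h) + log p(h)."""
    h = np.asarray(h, dtype=float)
    log_lik = -0.5 * np.log(2 * np.pi * sigma**2) - (v - w @ h)**2 / (2 * sigma**2)
    log_prior = np.sum(h * np.log(prior_p) + (1 - h) * np.log(1 - prior_p))
    return log_lik + log_prior

def elbo(v, q):
    """Evidence lower bound for a fully factorized q(h) = prod_i q_i(h_i).

    With discrete latent variables the expectation over q is an exact
    finite sum over all latent configurations; no sampling is needed.
    """
    total = 0.0
    for h in itertools.product([0, 1], repeat=len(q)):
        prob = np.prod([q[i] if h[i] else 1 - q[i] for i in range(len(q))])
        if prob > 0:
            total += prob * log_joint(v, h)
    entropy = -np.sum(q * np.log(q) + (1 - q) * np.log(1 - q))
    return total + entropy

# Coordinate ascent: the factorized *form* of q is fixed in advance,
# and optimization picks the best distribution within that family.
# Each q_i is updated here by a simple grid search over probabilities.
v_obs = 1.2
q = np.full(2, 0.5)
grid = np.linspace(0.01, 0.99, 99)
for _ in range(10):
    for i in range(len(q)):
        scores = [elbo(v_obs, np.where(np.arange(len(q)) == i, g, q)) for g in grid]
        q[i] = grid[int(np.argmax(scores))]

print("optimal factorized q:", q, "ELBO:", elbo(v_obs, q))
```

The grid search stands in for whatever update rule the model actually admits; in many models the coordinate updates have closed-form fixed-point solutions instead.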

Challenges


Discrete Latent Variables


Apologies if this part wasn't very clear in the talk: the math eventually leads to L being arithmetically computable. In the end, we can view sparse coding as an iterative autoencoder.
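As a rough illustration of the "iterative autoencoder" view, the sketch below uses ISTA (iterative shrinkage-thresholding), a standard inference procedure for sparse coding; the talk may have derived the iterations differently. The dictionary W, the sparsity weight lam, and the toy data are all invented for the example.

```python
import numpy as np

def soft_threshold(x, t):
    """Proximal operator of the L1 norm."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def ista_sparse_code(v, W, lam=0.1, step=None, n_iter=100):
    """Infer a sparse code h minimizing 0.5*||v - W h||^2 + lam*||h||_1.

    Each iteration first 'decodes' the current code (W @ h), compares the
    reconstruction to the input, and then 'encodes' the residual back into
    code space -- which is why iterative inference in sparse coding can be
    read as an unrolled, iterative autoencoder.
    """
    if step is None:
        # Step size bounded by the Lipschitz constant of the quadratic term.
        step = 1.0 / np.linalg.norm(W, 2) ** 2
    h = np.zeros(W.shape[1])
    for _ in range(n_iter):
        residual = v - W @ h                                       # decode
        h = soft_threshold(h + step * W.T @ residual, step * lam)  # re-encode
    return h

# Toy usage with a random dictionary (purely illustrative).
rng = np.random.default_rng(0)
W = rng.normal(size=(16, 32))
h_true = np.zeros(32)
h_true[[3, 17]] = [1.5, -2.0]
v = W @ h_true + 0.01 * rng.normal(size=16)
h_hat = ista_sparse_code(v, W, lam=0.05, n_iter=500)
print("nonzero code entries:", np.nonzero(np.round(h_hat, 2))[0])
```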