Please give us your feedback on any or all of the days or sessions!
[9:30-11:00] Session 1 (1.5 hours): Intro, review of basic RL, overview, Why RL?
What is and isn't RL?
Markov Decision Processes, Bellman Optimality Equation
Exploration vs. Exploitation
Common terminology
Slides
[11:00-11:30] break (0.5 hours)
[11:30-12:30] Session 1b: Intro (cont.)
[12:30-13:30] lunch (1 hour)
[13:30-14:00] Coffee Chat (30 mins, possibly on Gather.town)
[14:00-16:00] Session 2b: State-of-the-Art RL Algorithms
Value-Based Methods
Advanced Value-Based Methods: DQN, Rainbow
Policy-Based Methods: REINFORCE
Actor-Critic Methods: A2C, A3C, PPO, TD3, SAC
Slides
[16:00-16:30] break (0.5 hours)
[16:30-17:00] Session 4: Perspectives from Practical RL
The Reward Hacking Problem
Slides
[17:00-18:00] Session 5: Applications of RL (1 hour)
Recommendation systems
Balloon Localization
Manipulation
Urban Planning
Slides
[18:00-19:00] Reception (1 hour) (On Gather.town)
[9:30-11:00] Session 1: Using Models and Expert Supervision to Improve Sample Efficiency (1.5 hours)
Learning from Experts
Tradeoffs between Model-Based and Model-Free Algorithms
Slides
[11:00-11:30] break (0.5 hours)
[11:30-13:00] Session 2: Inverse Models, Goal-Based and Hierarchical RL
Inverse Models
Goal-Conditioned RL
Hindsight Experience Replay
Approaches to Hierarchical RL
Slides
[13:00-13:30] Discuss Problem Clinic over Coffee (30 mins)