Summer’21 Final Schedule

Please give us your feedback on any or all of the days or sessions!

Feedback survey: https://docs.google.com/forms/d/e/1FAIpQLSfvMZDKKReEYsc0qAeT8ZC8YapZNvkRuDo4Y-DDo7GDekfLZA/viewform

Day 1 (9:30am - 7:00pm)

[9:30-11:00] Session 1 (1.5 hours): Intro, review of basic RL, overview, Why RL?
- What is and isn't RL?
- Markov Decision Process, Bellman Optimality Equation
- Exploration vs. Exploitation
- Common terminology
  - On-Policy vs. Off-Policy
  - Analytical Models vs. Learnt Models vs. Model-Free
- Slides
  
  Introduction_PA_slides.pdf
  
  Session-1-1-Introduction-2021-06-28.pdf
[11:00-11:30] break (0.5 hours)
[11:30-12:30] Session 1b: Intro (cont.)
[12:30-13:30] lunch (1 hour)
[13:30-14:00] Coffee Chat (30 mins, possibly on Gather.town)
[14:00-16:00] Session 2b: State-of-the-Art RL Algorithms
- Value-Based Methods
- Advanced Value-Based Methods: DQN / Rainbow
- Policy-Based Methods: REINFORCE
- Actor-Critic Methods: A2C, A3C, PPO, TD3, SAC
- Slides
  
  Session-1-2-State-of-the-Art-2021-06-28b.pdf
[16:00-16:30] break (0.5 hours)
[16:30-17:00] Session 4: Perspectives from Practical RL
- The Reward Hacking Problem
- Slides
  
  Reward_Design_PE_RL_Summer21_Pulkit.pdf
[17:00-18:00] Session 5: Applications of RL (1 hour)
- Recommendation systems
- Balloon Localization
- Manipulation
- Urban Planning
- Slides
  
  Session-1-5-Applications-2021-06-28.pdf
  
  Square_CB
[18:00-19:00] Reception (1 hour) (On Gather.town)