Tasks for the PMs

Coding task 1 (NN implementation):

https://colab.research.google.com/drive/10k2Pgyi9K-wl7qOOsGvKyY3P8lYN7gKH?usp=sharing

Article reading task 2:

Part 1: A Brief Introduction to RL

Part 2: Introducing the Markov Process

Part 3: Markov Decision Process (MDP)

Part 4: Optimal Policy Search with MDP

Part 5: Monte-Carlo and Temporal-Difference Learning

Part 6: TD(λ) & Q-learning (use the link below for reference)

https://amreis.github.io/ml/reinf-learn/2017/11/02/reinforcement-learning-eligibility-traces.html

Part 7: A Brief Introduction to Deep Q Networks

Resources:

Minigrid - https://minigrid.farama.org/environments/minigrid/CrossingEnv/