Read the code in this order:
- scripts/run_hw1.py (you should read this file, but you don’t need to edit it)
- infrastructure/rl_trainer.py
- agents/bc_agent.py (another read-only file)
- policies/MLP_policy.py
- infrastructure/replay_buffer.py
- infrastructure/utils.py
- infrastructure/pytorch_utils.py
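Following that order, start with run_hw1.py. Begin reading at main(), which does the following:
- Adds the command-line arguments
- do_dagger: whether to use expert data (i.e. whether to run DAgger)
- Sets up the logging directory, etc.
- Key step: builds the BC_Trainer
- Key step: runs training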
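The main() flow above can be sketched roughly as follows. This is a minimal sketch, not the actual run_hw1.py: `build_params`, the `data_root` argument, and every flag except `do_dagger` are assumptions made for illustration, as are the log directory prefixes.

```python
import argparse
import os
import time


def build_params(argv=None, data_root="data"):
    """Parse arguments and derive a log directory (hypothetical helper)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--env_name", type=str, default="Ant-v2")   # assumed flag
    parser.add_argument("--exp_name", type=str, default="bc_test")  # assumed flag
    parser.add_argument("--n_iter", type=int, default=1)            # assumed flag
    parser.add_argument("--do_dagger", action="store_true")
    params = vars(parser.parse_args(argv))

    # Assumed convention: DAgger only makes sense with more than one iteration,
    # because each extra iteration adds newly collected expert-labeled data.
    if params["do_dagger"] and params["n_iter"] <= 1:
        raise ValueError("do_dagger requires n_iter > 1")

    # Assumed naming scheme: prefix + experiment name + environment + timestamp.
    prefix = "dagger_" if params["do_dagger"] else "bc_"
    run_name = "{}{}_{}_{}".format(
        prefix,
        params["exp_name"],
        params["env_name"],
        time.strftime("%d-%m-%Y_%H-%M-%S"),
    )
    params["logdir"] = os.path.join(data_root, run_name)
    return params


if __name__ == "__main__":
    params = build_params()
    print(params["logdir"])
    # The two key steps would follow here:
    #   trainer = BC_Trainer(params)
    #   trainer.run_training_loop()   # method name is an assumption
```

Keeping argument parsing in its own function makes it easy to call with an explicit list, e.g. `build_params(["--do_dagger", "--n_iter", "3"])`, without touching `sys.argv`.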
BC_Trainer
- Loads the parameters
- Builds the BCAgent
- Builds the RL trainer object
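The three BC_Trainer steps can be sketched as below. The `BCAgent` and `RL_Trainer` classes here are bare stand-ins for the real ones in agents/bc_agent.py and infrastructure/rl_trainer.py. The parameter names (`n_layers`, `size`, `learning_rate`) and the idea that the agent is passed to the trainer as a class plus a parameter dict are assumptions for illustration.

```python
class BCAgent:
    """Stand-in for the real agent, which would own a policy and a replay buffer."""

    def __init__(self, agent_params):
        self.agent_params = agent_params


class RL_Trainer:
    """Stand-in for the real trainer, which would also create the environment."""

    def __init__(self, params):
        self.params = params
        # Assumption: the trainer instantiates the agent from the class it is given.
        self.agent = params["agent_class"](params["agent_params"])


class BC_Trainer:
    def __init__(self, params):
        # Step 1: load the parameters (copied so the caller's dict is untouched).
        self.params = dict(params)

        # Step 2: describe the BCAgent to build (hypothetical parameter names).
        agent_params = {
            "n_layers": self.params["n_layers"],
            "size": self.params["size"],
            "learning_rate": self.params["learning_rate"],
        }
        self.params["agent_class"] = BCAgent
        self.params["agent_params"] = agent_params

        # Step 3: build the RL trainer object, which constructs the agent.
        self.rl_trainer = RL_Trainer(self.params)


trainer = BC_Trainer({"n_layers": 2, "size": 64, "learning_rate": 5e-3})
print(type(trainer.rl_trainer.agent).__name__)
```

Passing the agent as a class rather than an instance lets the same trainer be reused with other agent types without changing the trainer code.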