PPO (Proximal Policy Optimization) algorithm
MA-POCA (Multi-Agent Posthumous Credit Assignment) algorithm
Imitation Learning
Training methods provided by ML-Agents
1. Solving Complex Tasks Using Curriculum Learning
2. Training Robust Agents Using Environment Parameter Randomization
3. Training in Competitive Multi-Agent Environments with Self-Play
4. Training in Cooperative Multi-Agent Environments