I've built multiple AI agents, but understanding their design principles always felt fragmented. Fortunately, Stanford's CS329A changed that—it traces the evolution of agent architectures from first principles, showing how each innovation emerged from solving specific limitations. After reading all papers from its reading list, I've reorganized the key concepts here in a more narrative-driven format: from ReAct to Reflexion, from test-time scaling to train-time RL, from memory-as-text to memory-as-representation.
The simplest agent follows a three-step loop: Thought → Action → Observation. This is ReAct.
Here's how it works in practice. You give the agent a few-shot prompt showing examples of this loop. At each step:

- Thought: the model reasons in natural language about what to do next.
- Action: it emits a tool call (e.g., a search query or API request).
- Observation: the tool's output is appended to the context.

Then the loop repeats until the task is done.
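The loop above can be sketched in a few lines of Python. Everything here is a stand-in I made up for illustration: `fake_llm` plays the model, and `lookup` is a toy tool backed by a hard-coded dictionary.

```python
def lookup(query):
    """Toy tool: a hard-coded one-entry knowledge base (illustrative only)."""
    kb = {"capital of France": "Paris"}
    return kb.get(query, "no result")

def fake_llm(transcript):
    """Stand-in for the model: emits Thought/Action lines, then finishes
    once it sees the answer in an Observation."""
    if "Observation: Paris" in transcript:
        return "Thought: I have the answer.\nFinish: Paris"
    return "Thought: I should look this up.\nAction: lookup[capital of France]"

def react(question, max_steps=5):
    transcript = f"Question: {question}"
    for _ in range(max_steps):
        step = fake_llm(transcript)       # Thought + Action
        transcript += "\n" + step
        if "Finish:" in step:             # model declares the task done
            return step.split("Finish:")[1].strip()
        # Parse the Action line, run the tool, append the Observation.
        query = step.split("Action: lookup[")[1].rstrip("]")
        transcript += f"\nObservation: {lookup(query)}"
    return None

print(react("What is the capital of France?"))  # → Paris
```

Note that the only "memory" is the growing transcript, which is exactly why a bad Observation poisons every later step.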
The limitation? ReAct is terrible at error correction. If it makes a mistake in step 3, the mistake carries over into every subsequent loop. It has no mechanism to look back, ask whether the current approach is working, or decide to try something else.
Side note: LangChain has a variant of ReAct called "plan-and-execute" where the loop is different—first the agent drafts a plan, then executes multiple steps, then gives all results back to the planner to decide whether to continue or declare completion. It's still forward-only though.
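A rough sketch of that plan-and-execute control flow, with hypothetical `fake_planner` and `fake_executor` stand-ins (not LangChain's actual API):

```python
def fake_planner(task, results=None):
    """Stand-in planner: drafts a plan first; on review, an empty plan
    means 'declare completion' (an assumption for this sketch)."""
    if results is None:
        return ["search docs", "summarize findings"]
    return []

def fake_executor(step):
    """Stand-in executor for a single plan step."""
    return f"result of {step}"

def plan_and_execute(task):
    results = None
    while True:
        plan = fake_planner(task, results)  # draft or revise the plan
        if not plan:                        # planner declares completion
            return results
        results = [fake_executor(s) for s in plan]  # run all steps, then report back

print(plan_and_execute("research topic"))
# → ['result of search docs', 'result of summarize findings']
```

The key structural difference from ReAct is that the planner only sees batched results between planning rounds, but the flow is still forward-only.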
Reflexion fixes ReAct's biggest weakness by adding a feedback loop. Four components work together: an Actor that takes actions, an Evaluator that judges outcomes, a Self-Reflection model that writes a verbal analysis of each failure, and a Memory that stores those analyses.
The workflow is as follows:
Actor uses short-term + long-term memory as context
↓
Takes action
↓
Evaluator judges the result
↓
If failure: Reflection model writes analysis → stored in long-term memory
↓
Next episode uses this reflection as context
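The episode loop above can be sketched as follows. The `actor`, `evaluator`, and `reflector` here are toy lambdas of my own invention (the real components are LLM calls); the toy actor succeeds once it has at least one reflection in long-term memory.

```python
def run_reflexion(task, actor, evaluator, reflector, max_episodes=3):
    long_term = []  # long-term memory: reflections persist across episodes
    for _ in range(max_episodes):
        trajectory = actor(task, long_term)  # short-term memory is the trajectory itself
        if evaluator(trajectory):            # Evaluator judges the result
            return trajectory, long_term
        # On failure: write a reflection and store it for the next episode.
        long_term.append(reflector(trajectory))
    return None, long_term

# Toy components, for illustration only.
actor = lambda task, mem: f"attempt with {len(mem)} reflections"
evaluator = lambda traj: "1 reflections" in traj
reflector = lambda traj: f"'{traj}' failed; try a different approach"

result, reflections = run_reflexion("solve puzzle", actor, evaluator, reflector)
print(result)  # → attempt with 1 reflections
```

This is the mechanism ReAct lacks: failure produces a stored analysis, and the next attempt starts with that analysis in context instead of repeating the same trajectory.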
Note that the Memory has two tiers: