Retrieval-Augmented Generation (RAG) is an approach where a language model looks things up in an external knowledge source while answering, instead of relying only on what was in its training data.

At a high level, RAG does two things:

  1. Retrieves relevant information (documents, notes, web pages, PDFs) from a database or index.
  2. Generates an answer using both the user’s question and the retrieved information.
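The two steps above can be sketched in a few lines of Python. This is a toy illustration, not a production pipeline: the function names (`retrieve`, `build_prompt`) are hypothetical, a simple word-overlap score stands in for a real vector index, and the final generation step (calling an LLM with the prompt) is left out.

```python
def retrieve(question, documents, k=2):
    """Step 1: rank documents by word overlap with the question, return top k.

    A real system would use embeddings and a vector index instead of
    this toy overlap score.
    """
    q_words = set(question.lower().split())
    scored = sorted(
        documents,
        key=lambda d: len(q_words & set(d.lower().split())),
        reverse=True,
    )
    return scored[:k]


def build_prompt(question, passages):
    """Step 2 (setup): combine the question with the retrieved passages.

    The resulting prompt would be sent to a language model to generate
    the final answer.
    """
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer the question using the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


docs = [
    "RAG retrieves documents before generating an answer.",
    "Paris is the capital of France.",
    "The retriever can be a vector index or a keyword search.",
]
question = "What is the capital of France?"
prompt = build_prompt(question, retrieve(question, docs))
```

Here the retriever surfaces the passage about Paris because it shares the most words with the question, and the prompt hands that passage to the generator alongside the question itself.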

This makes answers more:

  1. Accurate, because the model can ground its response in text it just retrieved instead of recalling it from memory.
  2. Up to date, because the external knowledge source can be refreshed without retraining the model.
  3. Verifiable, because the answer can point back to the retrieved sources.

1. Intuitive Picture of RAG

1.1 RAG as an "Open-Book Exam" for LLMs

A normal language model without retrieval is like a student taking a closed-book exam. Everything they use must already be memorized.

RAG turns this into an open-book exam: