Overview

The system lets students and tutors ask natural-language questions and answers them from textbook content using Retrieval-Augmented Generation (RAG).

Architecture summary

The system consists of core services (the ML and storage layers) that are reused by two main flows.

List of components:

[Figure: list of components (image.png)]

Ingestion Flow

  1. Textbooks are collected from trusted sources and processed by a crawler or loader
  2. Content is chunked into semantically meaningful units (e.g., sections or pages)
  3. Each chunk is embedded using an embedding model
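
The chunking and embedding steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the system's actual implementation: the names Chunk, chunk_by_section, toy_embed, and ingest are hypothetical, chunks are split on blank lines as a stand-in for section or page boundaries, and toy_embed is a hashed bag-of-words placeholder where a real embedding model would be called.

```python
import hashlib
import math
import re
from dataclasses import dataclass


@dataclass
class Chunk:
    source: str  # identifier of the textbook the chunk came from
    index: int   # position of the chunk within that source
    text: str


def chunk_by_section(source: str, text: str) -> list[Chunk]:
    """Split text on blank lines; each non-empty block becomes one chunk."""
    blocks = [b.strip() for b in re.split(r"\n\s*\n", text) if b.strip()]
    return [Chunk(source, i, b) for i, b in enumerate(blocks)]


def toy_embed(text: str, dim: int = 8) -> list[float]:
    """Placeholder embedding (assumption): hashed bag of words, L2-normalized.

    A real deployment would call an embedding model here instead.
    """
    vec = [0.0] * dim
    for token in re.findall(r"\w+", text.lower()):
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        vec[digest[0] % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def ingest(source: str, text: str, dim: int = 8) -> list[tuple[Chunk, list[float]]]:
    """Chunk one document and pair every chunk with its embedding vector."""
    return [(c, toy_embed(c.text, dim)) for c in chunk_by_section(source, text)]
```

For example, calling ingest("algebra-textbook", raw_text) on loaded textbook text returns one (Chunk, vector) pair per blank-line-separated block. Keeping the source and index on each chunk is one way to let an answer be traced back to its place in the textbook.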