Overview
The system enables students and tutors to ask natural-language questions that are answered using textbook content through a Retrieval-Augmented Generation (RAG) approach.
Architecture summary
The system consists of core services (ML and storage layers) reused by two main flows:
- An ingestion flow for indexing textbook content
- A query flow for real-time question answering
Core components:
- Backend API
- Embedding model
- LLM
- Object storage
- Vector database
- PostgreSQL
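To make the query flow concrete, the sketch below wires toy stand-ins for three of the components above (embedding model, vector database, LLM prompt assembly) into a minimal retrieve-then-prompt loop. All names here (`embed`, `VectorStore`, `answer`) are illustrative stand-ins, not the actual service APIs; the embedding is a character-frequency vector used only to make the example self-contained.

```python
import math
from dataclasses import dataclass

def embed(text: str) -> list[float]:
    # Toy stand-in for the embedding model: normalized letter-frequency
    # vector over a-z. A real deployment would call an embedding service.
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cosine(a: list[float], b: list[float]) -> float:
    # Vectors are unit-normalized, so the dot product is cosine similarity.
    return sum(x * y for x, y in zip(a, b))

@dataclass
class Chunk:
    text: str
    vector: list[float]

class VectorStore:
    """In-memory stand-in for the vector database."""
    def __init__(self) -> None:
        self.chunks: list[Chunk] = []

    def add(self, text: str) -> None:
        self.chunks.append(Chunk(text, embed(text)))

    def search(self, query: str, k: int = 2) -> list[str]:
        # Rank stored chunks by similarity to the query embedding.
        q = embed(query)
        ranked = sorted(self.chunks, key=lambda c: cosine(q, c.vector), reverse=True)
        return [c.text for c in ranked[:k]]

def answer(query: str, store: VectorStore) -> str:
    # Retrieve the top-k chunks and assemble the prompt the LLM would
    # receive; the LLM call itself is out of scope for this sketch.
    context = "\n".join(store.search(query))
    return f"Context:\n{context}\n\nQuestion: {query}"
```

The key design point this illustrates is that the backend API only orchestrates: retrieval is delegated to the vector database and generation to the LLM, so either can be swapped independently.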

Ingestion Flow
- Textbooks are collected from trusted sources and processed by a crawler or loader
- Content is chunked into semantically meaningful units (e.g., sections or pages)
- Each chunk is embedded using an embedding model
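The chunking step above can be sketched as a simple paragraph-merging splitter. This is an assumed strategy for illustration (the actual chunker may split on sections or pages instead); `chunk_text` and `max_chars` are hypothetical names, not part of the system described.

```python
def chunk_text(text: str, max_chars: int = 500) -> list[str]:
    # Split on blank lines (paragraph boundaries), then greedily merge
    # consecutive paragraphs until the size budget is reached. A single
    # paragraph longer than max_chars is kept whole rather than cut
    # mid-sentence, preserving semantic units.
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks: list[str] = []
    current = ""
    for p in paragraphs:
        if current and len(current) + len(p) + 2 > max_chars:
            chunks.append(current)
            current = p
        else:
            current = f"{current}\n\n{p}" if current else p
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk would then be passed to the embedding model and stored alongside its vector, as described in the flow above.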