RAG Overview

Applying RAG to question answering over lesson materials

https://github.com/scalliontor/RAG.git:

RAG stands for Retrieval-Augmented Generation

Takes a PDF (a text data file) as input
Via a model: splits the file into chunks of information ⇒ stores them in a database (Vector Database)
Receives a question ⇒ inserts it into the prompt
Retriever: finds which part of the content is most relevant to the prompt
Passes that information to a model (a large language model, LLM) to generate a response
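
The question-side steps above (question ⇒ retriever ⇒ prompt ⇒ LLM) can be sketched in plain Python. This is only an illustration under stated assumptions: the chunks are hard-coded sample text, retrieval uses simple word overlap instead of a real embedding search, and `placeholder_llm` is a hypothetical stand-in for an actual model call. None of these names come from the repository above.

```python
def tokenize(text):
    """Lowercase word set with surrounding punctuation stripped."""
    return {t.strip(".,?!:;()").lower() for t in text.split()} - {""}


def retrieve(question, chunks, k=2):
    """Rank chunks by how many words they share with the question (toy retriever)."""
    q = tokenize(question)
    ranked = sorted(chunks, key=lambda c: len(q & tokenize(c)), reverse=True)
    return ranked[:k]


PROMPT_TEMPLATE = (
    "Answer the question using only the context below.\n\n"
    "Context:\n{context}\n\n"
    "Question: {question}\n"
    "Answer:"
)


def build_prompt(question, context_chunks):
    """Insert the retrieved chunks and the question into the prompt template."""
    context = "\n".join(f"- {c}" for c in context_chunks)
    return PROMPT_TEMPLATE.format(context=context, question=question)


def placeholder_llm(prompt):
    """Hypothetical stand-in for a real LLM call; returns a canned string."""
    return f"[model reply based on a {len(prompt)}-character prompt]"


# Sample chunks, assumed to have been extracted from a lesson document already.
chunks = [
    "Photosynthesis converts light energy into chemical energy in plants.",
    "The mitochondria produce most of the cell's energy supply.",
    "Newton's second law states that force equals mass times acceleration.",
]

question = "What does Newton's second law state?"
context = retrieve(question, chunks, k=1)
prompt = build_prompt(question, context)
answer = placeholder_llm(prompt)
print(prompt)
print(answer)
```

In a real system the word-overlap ranking would be replaced by a similarity search over embeddings, and `placeholder_llm` by a call to the chosen language model.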

Vietnamese LLM leaderboard: https://vmlu.ai/leaderboard

LLMs Quantization

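Quantization reduces model size by converting weights from 32-bit floating point to lower-precision formats such as 8-bit integers. Going from 32 bits to 8 bits per weight cuts weight storage by roughly 4x, at the cost of a small rounding error.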
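
To make the idea concrete, here is a minimal sketch of symmetric per-tensor int8 quantization in plain Python. It is an illustration, not the scheme any particular library uses: the weight values are made up, and real toolchains typically work per channel or per block on tensors rather than on Python lists.

```python
import struct


def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale factor."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the int8 codes."""
    return [v * scale for v in quantized]


# Made-up example weights (assumption for illustration only).
weights = [0.12, -0.5, 0.33, 1.27, -1.0]

q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Storage cost: 4 bytes per float32 weight versus 1 byte per int8 weight.
fp32_bytes = len(struct.pack(f"{len(weights)}f", *weights))
int8_bytes = len(struct.pack(f"{len(q)}b", *q))

max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale)
print(fp32_bytes, int8_bytes, max_error)
```

The rounding error of each weight is bounded by half the scale, which is why quantization usually costs only a little accuracy while shrinking the stored weights about fourfold.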

Retriever

Load File → Text Splitter → Vectorization (Embedding) → Vector Database
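
The indexing pipeline above can be sketched end to end with only the standard library. This is a toy version under several assumptions: the input is a plain-text file rather than a PDF, the "embedding" is a word-count vector standing in for a real embedding model, and `InMemoryVectorStore` is a hypothetical class, not a LangChain API.

```python
import math
import tempfile
from collections import Counter
from pathlib import Path


def load_text(path):
    """Load File: read the raw text (a real pipeline would parse a PDF here)."""
    return Path(path).read_text(encoding="utf-8")


def split_text(text, chunk_words=12, overlap=3):
    """Text Splitter: fixed-size word windows that overlap their neighbours."""
    words = text.split()
    step = chunk_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_words]))
        if start + chunk_words >= len(words):
            break
    return chunks


def embed(text):
    """Vectorization: sparse word-count vector (stand-in for an embedding model)."""
    tokens = (t.strip(".,?!:;()").lower() for t in text.split())
    return Counter(t for t in tokens if t)


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class InMemoryVectorStore:
    """Vector Database: keeps (chunk, vector) pairs and searches by similarity."""

    def __init__(self):
        self.items = []

    def add(self, chunk):
        self.items.append((chunk, embed(chunk)))

    def search(self, query, k=1):
        q = embed(query)
        scored = sorted(self.items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [chunk for chunk, _ in scored[:k]]


document = (
    "RAG stands for Retrieval-Augmented Generation. "
    "A text splitter cuts the source file into small chunks. "
    "Each chunk is turned into a vector by an embedding model. "
    "The vectors are stored in a vector database for similarity search. "
    "Quantization shrinks a language model by storing weights in lower precision."
)

# Write the sample text to a temporary file so the load step is exercised too.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "lesson.txt"
    path.write_text(document, encoding="utf-8")
    text = load_text(path)

chunks = split_text(text)
store = InMemoryVectorStore()
for chunk in chunks:
    store.add(chunk)

top = store.search("quantization", k=1)
print(top)
```

The overlap between neighbouring chunks is a common design choice: it reduces the chance that a sentence relevant to a question is cut in half at a chunk boundary.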

Uses LangChain: https://python.langchain.com/docs/tutorials/

Embedding Model