1. Doc Chatbot
Overview
A conversational assistant that lets users upload documents (PDF, Word, Markdown) and then ask free-form questions about them. Behind the scenes it uses retrieval-augmented generation (RAG): a retriever finds the relevant passages, and a generative model composes an answer from them.
Primary Use Cases
- Knowledge workers: Rapidly find clauses in long policies, specs, proposals
- Students: Query lecture notes, research papers
- Customer support reps: Look up SOPs or troubleshooting guides
Key Features
- File upload + OCR (for scanned PDFs)
- Semantic search over document chunks (vector store)
- Chat UI with follow-up question support
- On-demand summarization of entire sections
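
Semantic search over document chunks presupposes a chunking step. The following is a minimal sketch of one common approach, fixed-size word windows with overlap, so that a sentence cut at a boundary still appears whole in a neighboring chunk. The function name `chunk_text` and the default sizes are illustrative assumptions, not part of the project spec; a real pipeline might chunk by tokens or by section headings instead.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 200, overlap: int = 40) -> List[str]:
    """Split text into word-based chunks; consecutive chunks share `overlap` words.

    chunk_size and overlap are counted in whitespace-separated words
    (an assumption made for simplicity; token-based sizing is also common).
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks: List[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this window already reached the end of the text
    return chunks


if __name__ == "__main__":
    sample = " ".join(f"w{i}" for i in range(10))
    for chunk in chunk_text(sample, chunk_size=4, overlap=1):
        print(chunk)
```

Each returned chunk would then be embedded and stored in the vector index alongside a reference to its source document.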
Tech Stack
- Frontend: React + TypeScript + Tailwind (chat interface, file picker)
- Backend: FastAPI (Python) for AI endpoints; optional Go microservice for authentication or rate-limiting
- Database: PostgreSQL (user/docs metadata) + FAISS (vector store)
- AI Models:
  - Embeddings: sentence-transformers/all-MiniLM-L6-v2
  - Generator: google/flan-t5-large or similar from Hugging Face
Architecture
- Upload service ingests document → PyMuPDF/pdfplumber extracts text → chunk + embed → store in FAISS.
- Query service receives user prompt → embed → FAISS k-NN → retrieve top passages → concatenate with prompt → generate answer via FLAN-T5.
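
The query path can be sketched in plain Python to make the data flow concrete. This is an illustration under stated assumptions, not the project's implementation: a brute-force cosine-similarity search stands in for the FAISS k-NN lookup, hand-written 3-dimensional vectors stand in for real all-MiniLM-L6-v2 embeddings, and the function names and prompt template are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_passages(
    query_vec: Sequence[float],
    passage_vecs: Sequence[Sequence[float]],
    passages: Sequence[str],
    k: int = 3,
) -> List[Tuple[float, str]]:
    """Brute-force k-NN: score every passage, return the k most similar.

    Stand-in for the vector-store lookup; a real index avoids the full scan.
    """
    scored = [
        (cosine_similarity(query_vec, vec), text)
        for vec, text in zip(passage_vecs, passages)
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def build_prompt(question: str, retrieved: Sequence[str]) -> str:
    """Concatenate retrieved passages with the user question (template is an assumption)."""
    context = "\n\n".join(f"[{i + 1}] {p}" for i, p in enumerate(retrieved))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


if __name__ == "__main__":
    # Toy data: in the real service these vectors would come from the embedding model.
    passages = [
        "Refunds are issued within 30 days.",
        "Support is open on weekdays.",
        "Shipping takes 5 business days.",
    ]
    passage_vecs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.7, 0.0, 0.7]]
    query_vec = [0.9, 0.1, 0.0]

    hits = top_k_passages(query_vec, passage_vecs, passages, k=2)
    print(build_prompt("How long do refunds take?", [text for _, text in hits]))
```

The resulting prompt string is what would be passed to the generator model. Numbering the passages in the context makes it possible to ask the model to cite which passage supports its answer, though that is a design choice rather than a requirement.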