RAG System Overview
A production-grade Retrieval-Augmented Generation (RAG) system built with Python and FastAPI. It ingests documents from multiple sources, processes and embeds them into a vector store, and provides semantic search with LLM-powered question answering.
What It Does
The RAG system enables you to:
- Ingest documents from URLs (web scraping) or file uploads (PDF, DOCX, PPTX, HTML, Markdown, CSV, XLSX, and PNG/JPG/TIFF images)
- Process and chunk text with intelligent splitting, deduplication, and language detection
- Store embeddings in ChromaDB with metadata in PostgreSQL
- Search semantically using hybrid vector + keyword search with cross-encoder reranking
- Generate answers using an LLM grounded in your document knowledge base
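The chunking and deduplication step in the list above can be illustrated with a minimal sketch. It assumes fixed-size character windows with overlap and exact-match deduplication on normalized text; the system's actual splitting strategy is not specified here, and `chunk_text` and `deduplicate` are hypothetical names used only for illustration.

```python
import hashlib


def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows (hypothetical helper)."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap  # how far each window advances
    chunks: list[str] = []
    for start in range(0, len(text), step):
        piece = text[start:start + chunk_size].strip()
        if piece:
            chunks.append(piece)
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


def deduplicate(chunks: list[str]) -> list[str]:
    """Drop chunks that repeat an earlier one, ignoring case and whitespace runs."""
    seen: set[str] = set()
    unique: list[str] = []
    for chunk in chunks:
        normalized = " ".join(chunk.lower().split())
        key = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        if key not in seen:
            seen.add(key)
            unique.append(chunk)  # keep the first occurrence as written
    return unique
```

Overlap keeps a sentence that straddles a window boundary retrievable from at least one chunk, at the cost of storing some text twice. A production splitter would more likely break on sentence or paragraph boundaries than on raw character offsets.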
Key Capabilities
| Capability | Details |
|---|---|
| Document Formats | PDF, DOCX, PPTX, HTML, Markdown, CSV, XLSX, PNG, JPG, TIFF |
| OCR Support | Tesseract + RapidOCR for scanned documents and images |
| Languages | 50+ languages with specialized Arabic/RTL support |
| Search | Hybrid search (vector + BM25) combined with Reciprocal Rank Fusion (RRF) |
| Reranking | Cross-encoder reranking via HuggingFace TEI |
| LLM | Ollama integration (Gemma 3, Llama, and other Ollama-hosted models) |
| Storage | ChromaDB (vectors) + PostgreSQL (metadata) |
| API | FastAPI REST API with OpenAPI/Swagger docs |
| Async | Full async pipeline with background task support |
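
To make the hybrid search row concrete, here is a minimal sketch of Reciprocal Rank Fusion over two ranked lists, one from vector search and one from BM25. The function name `reciprocal_rank_fusion` is hypothetical, and `k = 60` is the constant commonly used with RRF; the value this system uses is not stated in this overview.

```python
from collections import defaultdict


def reciprocal_rank_fusion(
    rankings: list[list[str]], k: int = 60
) -> list[tuple[str, float]]:
    """Fuse several ranked lists of document IDs into one scored ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    where rank starts at 1 for the top result of a list.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Illustrative inputs only: IDs as returned by two independent retrievers.
vector_hits = ["a", "b", "c"]
bm25_hits = ["b", "d", "a"]
fused = reciprocal_rank_fusion([vector_hits, bm25_hits])
```

RRF uses only rank positions, so it needs no score normalization between cosine similarities and BM25 scores. Documents found by both retrievers rise to the top, and a cross-encoder reranker can then rescore the fused shortlist.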
Architecture at a Glance
The system follows a layered architecture:
- API Layer — FastAPI server with authentication, routing, and request validation
- Core Layer — RAGSystem orchestrator, DocumentManager, and SemanticRetriever
- Service Layer — Embedder, Reranker, LLM client, and BM25 index
- Storage Layer — ChromaDB vector store and PostgreSQL metadata database
- External Services — Ollama (embeddings + LLM) and HuggingFace TEI (reranking)
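
The layering above can be sketched as an orchestrator that depends only on small interfaces, so that storage and model backends can be swapped. Everything in this sketch is an assumption made for illustration: `RAGSketch`, `InMemoryVectorStore`, `VowelCountEmbedder`, and `EchoLLM` are hypothetical stand-ins, not the interfaces of the actual RAGSystem, ChromaDB, or Ollama clients.

```python
from dataclasses import dataclass, field
from typing import Protocol


class Embedder(Protocol):
    """Service-layer interface (assumed shape)."""
    def embed(self, text: str) -> list[float]: ...


class LLMClient(Protocol):
    """Service-layer interface (assumed shape)."""
    def complete(self, prompt: str) -> str: ...


@dataclass
class InMemoryVectorStore:
    """Storage-layer stand-in for a vector database."""
    rows: list[tuple[str, list[float]]] = field(default_factory=list)

    def add(self, text: str, vector: list[float]) -> None:
        self.rows.append((text, vector))

    def query(self, vector: list[float], top_k: int) -> list[str]:
        def dot(a: list[float], b: list[float]) -> float:
            return sum(x * y for x, y in zip(a, b))
        ranked = sorted(self.rows, key=lambda row: dot(row[1], vector), reverse=True)
        return [text for text, _ in ranked[:top_k]]


class VowelCountEmbedder:
    """Toy embedder: one dimension per vowel count. Not a real embedding model."""
    def embed(self, text: str) -> list[float]:
        lowered = text.lower()
        return [float(lowered.count(ch)) for ch in "aeiou"]


class EchoLLM:
    """Stub LLM that returns the prompt, so the data flow is visible."""
    def complete(self, prompt: str) -> str:
        return f"[stub answer] {prompt}"


@dataclass
class RAGSketch:
    """Core-layer orchestrator wiring the service and storage layers together."""
    embedder: Embedder
    store: InMemoryVectorStore
    llm: LLMClient

    def ingest(self, text: str) -> None:
        self.store.add(text, self.embedder.embed(text))

    def ask(self, question: str, top_k: int = 1) -> str:
        context = self.store.query(self.embedder.embed(question), top_k)
        prompt = "Context:\n" + "\n".join(context) + f"\nQuestion: {question}"
        return self.llm.complete(prompt)


rag = RAGSketch(VowelCountEmbedder(), InMemoryVectorStore(), EchoLLM())
rag.ingest("aaa")
rag.ingest("eee")
answer = rag.ask("a?")
```

In this arrangement the API layer would call only `ingest` and `ask`, and the stubs would be replaced by clients for the real external services without changing the orchestrator.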
Quick Links
- Getting Started — Installation and setup
- Architecture — System design and diagrams
- API Reference — REST API endpoints
- Core Modules — Component documentation
- Deployment — Docker and production setup