Deployment Overview

The RAG system can be deployed in three ways: as a local Uvicorn process for development, as a Docker Compose service, or on Kubernetes.

Deployment Architecture

For local development, run the API server directly with Python:

# Activate virtual environment
source .venv/bin/activate

# Start the server
uvicorn api.main:app --reload --host 0.0.0.0 --port 9000
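Once the server is started, it can be useful to wait until it actually answers before sending requests. The following is a minimal sketch that polls the health endpoint mentioned later on this page (`/api/v1/health/ready`) on port 9000. The helper name `wait_until_ready`, the timeout values, and the assumption that the endpoint returns HTTP 200 once the service is ready are illustrative and not taken from the project.

```python
import time
import urllib.error
import urllib.request

# Ignore any HTTP proxy settings: the target is a local development server.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def wait_until_ready(url: str, timeout_s: float = 30.0, interval_s: float = 0.5) -> bool:
    """Poll `url` until it answers with HTTP 200 or `timeout_s` elapses.

    Hypothetical helper; assumes the readiness endpoint returns 200 when ready.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with _opener.open(url, timeout=2.0) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            # Not accepting connections yet, or it answered with an error status.
            pass
        time.sleep(interval_s)
    return False
```

For example, `wait_until_ready("http://localhost:9000/api/v1/health/ready")` returns `True` as soon as the server responds, and `False` if it stays unreachable for the whole timeout.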

Requirements:

Deploy with Docker Compose:

docker compose up -d api

The container connects to external services (Ollama, TEI, PostgreSQL) running on the host via `host.docker.internal`.
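If the API container cannot reach a host service, a quick TCP check from inside the container helps narrow the problem down. The sketch below is an assumption-laden illustration: the names `HOST_SERVICES`, `is_reachable`, and `check_host_services` are hypothetical, and the ports are guesses to adjust. 11434 and 5432 are the usual defaults for Ollama and PostgreSQL, while the TEI port depends entirely on how it was started. On Linux, `host.docker.internal` typically resolves only when the service maps it to `host-gateway` via `extra_hosts`.

```python
import socket

# Hypothetical port map; change it to match the services on your host.
HOST_SERVICES = {"ollama": 11434, "tei": 8080, "postgres": 5432}


def is_reachable(host: str, port: int, timeout_s: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


def check_host_services(host: str = "host.docker.internal", services=None) -> dict:
    """Map each service name to whether its port accepts connections."""
    services = HOST_SERVICES if services is None else services
    return {name: is_reachable(host, port) for name, port in services.items()}
```

Calling `check_host_services()` inside the container returns a dictionary such as `{"ollama": ..., "tei": ..., "postgres": ...}` with one boolean per service. A `False` entry only shows that the port did not accept a TCP connection; it does not say why.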

See Docker Deployment for full configuration.

Kubernetes manifests for orchestrated deployments are available in `.kubernates/`.

| Item | Description |
|------|-------------|
| API Keys | Set strong `API_KEYS` in `.env` |
| CORS | Restrict `API_CORS_ORIGINS` to your domain |
| Database | Use a managed PostgreSQL/Supabase instance |
| Ollama | Deploy on a GPU server for best performance |
| Monitoring | Enable health checks at `/api/v1/health/ready` |
| Logging | Logs are written to `logs/` directory |
| Backups | Back up ChromaDB `data/chroma_db/` and PostgreSQL |
| Workers | Use multiple Uvicorn workers (`--workers 2` default in Docker) |
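The first two items above can be checked mechanically before a release. The following is a rough sketch, not part of the project: `generate_api_key` and `check_production_env` are hypothetical helpers, the 32-character minimum is an arbitrary threshold, and the code assumes that `API_KEYS` and `API_CORS_ORIGINS` hold comma-separated values, which this page does not confirm.

```python
import secrets


def generate_api_key() -> str:
    """Create a random URL-safe key (43 characters from 32 random bytes)."""
    return secrets.token_urlsafe(32)


def check_production_env(env: dict, min_key_length: int = 32) -> list:
    """Return a list of human-readable problems found in `env`.

    Assumes comma-separated values for API_KEYS and API_CORS_ORIGINS.
    """
    problems = []

    keys = [k.strip() for k in env.get("API_KEYS", "").split(",") if k.strip()]
    if not keys:
        problems.append("API_KEYS is empty")
    for key in keys:
        if len(key) < min_key_length:
            problems.append(
                f"API key of length {len(key)} is shorter than {min_key_length} characters"
            )

    origins = [o.strip() for o in env.get("API_CORS_ORIGINS", "").split(",") if o.strip()]
    if not origins:
        problems.append("API_CORS_ORIGINS is empty")
    elif "*" in origins:
        problems.append("API_CORS_ORIGINS allows any origin (*)")

    return problems
```

Passing `dict(os.environ)` after loading `.env` gives a list of findings; an empty list means neither check fired. Length is only a crude proxy for key strength, so generating keys with `generate_api_key()` is the more reliable half of this sketch.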