Mjara Docs

Docker Deployment

Docker and Docker Compose deployment configuration

Dockerfile

The image is built on python:3.12-slim and installs the system dependencies needed for OCR and document parsing.

Key features:

  • Base: python:3.12-slim
  • System packages: Tesseract OCR (Arabic + English), Poppler, ImageMagick
  • Non-root user: runs as appuser (UID 1000) via gosu
  • Health check: HTTP GET to /api/v1/health
  • Entrypoint: entrypoint.sh creates required directories and drops privileges

Docker Compose

The docker-compose.yml defines the API service with all required configuration:

services:
  api:
    build:
      context: .
      dockerfile: Dockerfile
    container_name: rag-api
    extra_hosts:
      - "host.docker.internal:host-gateway"
    env_file:
      - .env
    environment:
      - API_HOST=0.0.0.0
      - API_PORT=9000
      - API_KEYS=${API_KEYS:-}
      - API_CORS_ORIGINS=${API_CORS_ORIGINS:-*}
      - EMBEDDING_MODEL=${EMBEDDING_MODEL:-nextfire/paraphrase-multilingual-minilm:l12-v2}
      - EMBEDDING_BASE_URL=${EMBEDDING_BASE_URL:-http://host.docker.internal:11434}
      - RERANKER_ENABLED=${RERANKER_ENABLED:-true}
      - RERANKER_BASE_URL=${RERANKER_BASE_URL:-http://host.docker.internal:8090}
      - LLM_MODEL=${LLM_MODEL:-gemma3:latest}
      - LLM_BASE_URL=${LLM_BASE_URL:-http://host.docker.internal:11434}
      - DB_ENABLED=${DB_ENABLED:-false}
      - DB_URL=${DB_URL:-postgresql://postgres:postgres@host.docker.internal:54322/postgres}
      - VLM_ENABLED=${VLM_ENABLED:-true}
      - VLM_MODEL=${VLM_MODEL:-phi4:latest}
      - VLM_BASE_URL=${VLM_BASE_URL:-http://host.docker.internal:11434}
    volumes:
      - ./data:/app/data
      - ./logs:/app/logs
    ports:
      - "${API_PORT:-9000}:9000"
    command: uvicorn api.main:app --host 0.0.0.0 --port 9000 --workers 2
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:9000/api/v1/health"]
      interval: 30s
      timeout: 10s
      start_period: 60s
      retries: 3
    restart: unless-stopped
    networks:
      - rag-network

networks:
  rag-network:
    driver: bridge

Start the Service

docker compose up -d api
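
The Compose health check allows a 60-second start period, so the API may not answer immediately after the container starts. The following is a minimal Python sketch, not part of the project, that polls the health endpoint until it returns HTTP 200. The function name wait_for_health and its timeout and interval parameters are illustrative assumptions.

```python
import time
import urllib.error
import urllib.request


def wait_for_health(url: str, timeout: float = 90.0, interval: float = 2.0) -> bool:
    """Poll `url` until it returns HTTP 200 or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=5) as response:
                if response.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            # Connection refused, timeouts, and HTTP error statuses all land here.
            pass
        time.sleep(interval)
    return False
```

With the default port mapping, wait_for_health("http://localhost:9000/api/v1/health") would return True once the API responds, or False if it has not responded within 90 seconds.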

Host Networking

The container reaches services running on the host machine through the hostname host.docker.internal. Docker Desktop defines this hostname automatically; on Linux, the extra_hosts directive maps it to the Docker host gateway:

extra_hosts:
  - "host.docker.internal:host-gateway"

This allows the container to reach:

  • Ollama (embeddings, LLM, VLM) on the host at port 11434
  • TEI Reranker on the host at port 8090
  • PostgreSQL on the host at port 54322
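
If one of these services cannot be reached, a plain TCP connection test can help narrow down the cause. The following is a minimal Python sketch under stated assumptions: is_reachable and check_host_services are hypothetical helpers, not part of the project, and the ports are the defaults listed above. It only confirms that a port accepts connections, not that the service behind it is working.

```python
import socket


def is_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and hostname resolution failures.
        return False


# Default ports from the list above. Run this inside the container, where
# host.docker.internal resolves to the host gateway.
HOST_SERVICES = {
    "Ollama": 11434,
    "TEI Reranker": 8090,
    "PostgreSQL": 54322,
}


def check_host_services(host: str = "host.docker.internal") -> dict[str, bool]:
    """Map each service name to whether its port accepts TCP connections."""
    return {name: is_reachable(host, port) for name, port in HOST_SERVICES.items()}
```

Calling check_host_services() from a Python session inside the running container returns a dictionary such as one entry per service, with False marking a port that did not accept a connection.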

Environment Variables for Docker

Override the defaults in your .env file or directly in the environment block of docker-compose.yml:

# External services (accessible via host.docker.internal)
EMBEDDING_BASE_URL=http://host.docker.internal:11434
LLM_BASE_URL=http://host.docker.internal:11434
RERANKER_BASE_URL=http://host.docker.internal:8090
VLM_BASE_URL=http://host.docker.internal:11434

# PostgreSQL (host or remote)
DB_URL=postgresql://user:password@host.docker.internal:54322/database

# API (with Docker Compose, API_PORT sets only the published host port; the container listens on 9000)
API_KEYS=your-secret-key
API_PORT=9000
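
The ${NAME:-default} entries in docker-compose.yml fall back to the default when the variable is unset or empty. The following Python sketch illustrates that rule only; it is not how Docker Compose is implemented, it handles just the :- form, and resolve is a hypothetical helper name.

```python
import re

# Matches the ${NAME:-default} form used in docker-compose.yml.
_PATTERN = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*):-([^}]*)\}")


def resolve(template: str, env: dict[str, str]) -> str:
    """Substitute ${NAME:-default}: use env[NAME] if set and non-empty, else the default."""

    def replace(match: re.Match) -> str:
        name, default = match.group(1), match.group(2)
        # An empty string counts as unset, matching the :- semantics.
        return env.get(name) or default

    return _PATTERN.sub(replace, template)
```

For example, resolve("${API_PORT:-9000}:9000", {"API_PORT": "8080"}) returns "8080:9000": only the host side of the port mapping changes, while an empty env dictionary yields "9000:9000".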

Building the Image

# Build
docker build -t rag-system .

# Run standalone
docker run -d \
  --name rag-api \
  -p 9000:9000 \
  --add-host=host.docker.internal:host-gateway \
  --env-file .env \
  -v "$(pwd)/data:/app/data" \
  -v "$(pwd)/logs:/app/logs" \
  rag-system

Volume Mounts

Mount               Purpose
./data:/app/data    ChromaDB storage, scraped docs, uploads
./logs:/app/logs    Application logs
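
Both mounts are bind mounts relative to the project directory. The following is a minimal Python sketch, assuming you prefer to create the host directories yourself before the first start; prepare_mounts is a hypothetical helper, not part of the project. The writability check applies to the current host user, not to appuser inside the container.

```python
import os
from pathlib import Path


def prepare_mounts(base: Path, names: tuple[str, ...] = ("data", "logs")) -> dict[str, bool]:
    """Create each mount directory under `base` and report whether it is writable."""
    result = {}
    for name in names:
        path = base / name
        # exist_ok=True makes the call safe to repeat on an existing directory.
        path.mkdir(parents=True, exist_ok=True)
        result[name] = os.access(path, os.W_OK)
    return result
```

Calling prepare_mounts(Path(".")) from the project root creates ./data and ./logs if they are missing and leaves existing directories untouched.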

CI/CD

The project includes a .gitlab-ci.yml for GitLab CI/CD with Docker-based build and deploy stages.
