
Configuration

Environment variables and settings for the RAG system

The RAG system uses Pydantic-based settings loaded from environment variables. All configuration is managed through the .env file.

Environment Variables

Embedding Configuration

| Variable | Default | Description |
| --- | --- | --- |
| `EMBEDDING_MODEL` | `nextfire/paraphrase-multilingual-minilm:l12-v2` | Ollama embedding model name |
| `EMBEDDING_BASE_URL` | `http://localhost:11434` | Ollama API base URL |
| `EMBEDDING_TIMEOUT` | `60` | Request timeout in seconds |
| `EMBEDDING_NORMALIZE` | `true` | Normalize embedding vectors |
| `LOCAL_EMBED_MODEL` | `sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2` | Local sentence-transformers model (optional fallback) |
| `LOCAL_EMBED_DEVICE` | (auto-detect) | Device for local model (`cpu`, `cuda`, `mps`) |
| `LOCAL_EMBED_NORMALIZE` | `true` | Normalize local embeddings |
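
When `EMBEDDING_NORMALIZE` is enabled, embedding vectors are scaled to unit length so that dot products behave like cosine similarity. A minimal sketch of that L2 normalization step (the helper name `l2_normalize` is illustrative, not part of this codebase):

```python
import math

def l2_normalize(vec: list[float]) -> list[float]:
    """Scale a vector to unit L2 norm; zero vectors are returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return vec
    return [x / norm for x in vec]

# After normalization, the dot product of two vectors is their cosine similarity.
unit = l2_normalize([3.0, 4.0])  # original norm was 5.0
```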

LLM Configuration

| Variable | Default | Description |
| --- | --- | --- |
| `LLM_MODEL` | `gemma3:latest` | Ollama LLM model name |
| `LLM_BASE_URL` | `https://ollama.dragonteam.dev` | Ollama API base URL |
| `LLM_TEMPERATURE` | `0.7` | Generation temperature (0.0-2.0) |
| `LLM_TIMEOUT` | `120` | Request timeout in seconds |

Reranker Configuration

| Variable | Default | Description |
| --- | --- | --- |
| `RERANKER_ENABLED` | `true` | Enable/disable cross-encoder reranking |
| `RERANKER_BASE_URL` | `http://localhost:8090` | HuggingFace TEI server URL |
| `RERANKER_MODEL` | `BAAI/bge-reranker-v2-m3` | Reranker model name |
| `RERANKER_TIMEOUT` | `30` | Request timeout in seconds |
| `RERANKER_INITIAL_K` | `20` | Number of candidates to retrieve before reranking |
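
`RERANKER_INITIAL_K` controls the candidate pool: the retriever fetches 20 documents, the cross-encoder rescores them, and only the best `RAG_TOP_K` survive. A sketch of that narrowing step, assuming the reranker has already produced one score per candidate (function and field names are illustrative):

```python
def rerank(candidates: list[dict], scores: list[float], top_k: int = 5) -> list[dict]:
    """Keep the top_k candidates by cross-encoder score, highest first."""
    ranked = sorted(zip(scores, candidates), key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in ranked[:top_k]]

docs = [{"id": i} for i in range(20)]    # RERANKER_INITIAL_K candidates
scores = [i * 0.01 for i in range(20)]   # scores from the TEI reranker
best = rerank(docs, scores, top_k=5)     # narrowed to RAG_TOP_K documents
```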

Vector Store Configuration

| Variable | Default | Description |
| --- | --- | --- |
| `VECTOR_BACKEND` | `chroma` | Vector store backend |
| `VECTOR_PERSIST_PATH` | `./data/chroma_db` | Local ChromaDB storage path |
| `VECTOR_COLLECTION_NAME` | `rag_documents` | ChromaDB collection name |

RAG Configuration

| Variable | Default | Description |
| --- | --- | --- |
| `RAG_TOP_K` | `5` | Number of documents to use as context |
| `RAG_INCLUDE_TIMING` | `true` | Include timing breakdown in responses |
| `RAG_USE_HYBRID` | `true` | Enable hybrid search (vector + BM25) |
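
With `RAG_USE_HYBRID` enabled, each query produces two ranked lists (vector similarity and BM25) that must be merged. One common way to merge them is reciprocal rank fusion; whether this system uses RRF or some other fusion scheme isn't stated here, so the following is purely an illustrative sketch:

```python
def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge ranked ID lists with reciprocal rank fusion: score = sum of 1/(k + rank)."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda d: scores[d], reverse=True)

vector_hits = ["a", "b", "c"]   # ranked by vector similarity
bm25_hits = ["b", "d", "a"]     # ranked by BM25
fused = rrf_fuse([vector_hits, bm25_hits])
```

Documents that appear high in both lists (like `b` here) rise to the top, while a document found by only one retriever still gets a chance to rank.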

API Configuration

| Variable | Default | Description |
| --- | --- | --- |
| `API_KEYS` | (empty) | Comma-separated API keys; empty = auth disabled |
| `API_HOST` | `0.0.0.0` | Host to bind to |
| `API_PORT` | `9000` | Port to listen on |
| `API_CORS_ORIGINS` | `*` | Allowed CORS origins (comma-separated) |
| `API_DEBUG` | `false` | Enable debug mode with auto-reload |

Text Chunking Configuration

| Variable | Default | Description |
| --- | --- | --- |
| `CHUNK_TARGET_WORDS` | `500` | Target words per chunk |
| `CHUNK_MIN_WORDS` | `100` | Minimum words per chunk |
| `CHUNK_MAX_WORDS` | `600` | Maximum words per chunk |
| `CHUNK_OVERLAP_WORDS` | `50` | Overlap words between chunks |
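
These knobs interact: the chunker aims for about 500 words per chunk, and each chunk repeats the trailing 50 words of the previous one so that context spanning a chunk boundary isn't lost. A minimal sliding-window sketch under those assumptions (the real chunker may additionally respect sentence boundaries and enforce the min/max limits):

```python
def chunk_words(words: list[str], target: int = 500, overlap: int = 50) -> list[list[str]]:
    """Split a word list into ~target-word chunks, each overlapping the previous by `overlap` words."""
    if len(words) <= target:
        return [words]
    step = target - overlap  # advance by 450 words per chunk by default
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(words[start:start + target])
        if start + target >= len(words):
            break
    return chunks

chunks = chunk_words([f"w{i}" for i in range(1000)])
```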

Database Configuration

| Variable | Default | Description |
| --- | --- | --- |
| `DB_URL` | (required if enabled) | PostgreSQL connection string |
| `DB_ENABLED` | `true` | Enable/disable PostgreSQL metadata storage |
| `DB_LOG_DUPLICATES` | `false` | Log duplicate detection events |

VLM Configuration (Vision Language Model)

| Variable | Default | Description |
| --- | --- | --- |
| `VLM_ENABLED` | `true` | Enable image/chart description |
| `VLM_MODEL` | `gemma3:4b` | Vision model name |
| `VLM_BASE_URL` | `http://localhost:11434` | Ollama API base URL |
| `VLM_TIMEOUT` | `120` | Request timeout in seconds |
| `VLM_PROMPT` | (built-in) | Prompt for image description |

Redis Configuration (Optional)

| Variable | Default | Description |
| --- | --- | --- |
| `REDIS_HOST` | `localhost` | Redis server host |
| `REDIS_PORT` | `6379` | Redis server port |
| `REDIS_BROKER_DB` | `0` | Redis DB for Celery broker |
| `REDIS_BACKEND_DB` | `1` | Redis DB for Celery backend |

Example .env File

```bash
# Embedding (Ollama API)
EMBEDDING_MODEL=bge-m3:latest
EMBEDDING_BASE_URL=https://your-ollama-server.example.com
EMBEDDING_TIMEOUT=60
EMBEDDING_NORMALIZE=true

# Reranker (HuggingFace TEI)
RERANKER_ENABLED=true
RERANKER_BASE_URL=http://your-reranker-server:8787
RERANKER_MODEL=BAAI/bge-reranker-v2-m3
RERANKER_TIMEOUT=30
RERANKER_INITIAL_K=20

# LLM (Ollama API)
LLM_MODEL=gemma3:latest
LLM_BASE_URL=https://your-ollama-server.example.com
LLM_TEMPERATURE=0.7
LLM_TIMEOUT=120

# Vector Store
VECTOR_BACKEND=chroma
VECTOR_PERSIST_PATH=./data/chroma_db
VECTOR_COLLECTION_NAME=rag_documents

# RAG
RAG_TOP_K=5
RAG_INCLUDE_TIMING=true
RAG_USE_HYBRID=true

# API
API_KEYS=your-secret-key-1,your-secret-key-2
API_HOST=0.0.0.0
API_PORT=9000
API_CORS_ORIGINS=*

# Text Chunking
CHUNK_TARGET_WORDS=500
CHUNK_MIN_WORDS=100
CHUNK_MAX_WORDS=600
CHUNK_OVERLAP_WORDS=50

# PostgreSQL
DB_URL=postgresql://user:password@host:port/database
DB_ENABLED=true

# VLM (Vision Language Model)
VLM_ENABLED=true
VLM_MODEL=gemma3:4b
VLM_BASE_URL=https://your-ollama-server.example.com

# Redis (for Celery - optional)
REDIS_HOST=localhost
REDIS_PORT=6379
```

Accessing Settings in Code

```python
from config import get_settings

settings = get_settings()

# Access nested settings
print(settings.embedding.model)
print(settings.reranker.model)
print(settings.llm.model)
print(settings.database.url)
```

The get_settings() function returns a cached singleton, so it can be called multiple times without performance overhead.
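
A cached singleton of this kind is typically built with `functools.lru_cache`. The details of the project's `config` module aren't shown here, so the `Settings` class below is a stand-in for the real Pydantic settings model:

```python
from dataclasses import dataclass
from functools import lru_cache

@dataclass(frozen=True)
class Settings:
    # Stand-in for the real Pydantic settings model.
    llm_model: str = "gemma3:latest"

@lru_cache(maxsize=1)
def get_settings() -> Settings:
    """The first call constructs Settings (reading the environment); later calls reuse the same object."""
    return Settings()
```

Because the cache holds one instance, changes to environment variables after the first call are not picked up until the process restarts (or the cache is cleared).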
