# Celery Workers

Distributed task processing with Celery and Redis
The RAG system includes Celery task definitions for distributed document processing using Redis as the message broker. This is useful for offloading heavy processing (parsing, embedding) from the API server.
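The broker and backend wiring might look like the following sketch. It assumes a `celery_app` module (the module name used by the worker commands below) and the `REDIS_*` settings from `.env`; the application name and defaults are illustrative, not taken from the codebase:

```python
import os

from celery import Celery


def _redis_url(db_env: str, default_db: str) -> str:
    # Build a redis:// URL from the REDIS_* variables documented in .env.
    host = os.getenv("REDIS_HOST", "localhost")
    port = os.getenv("REDIS_PORT", "6379")
    return f"redis://{host}:{port}/{os.getenv(db_env, default_db)}"


# Hypothetical celery_app.py: the broker carries task messages,
# the backend stores task results.
app = Celery(
    "rag",  # assumed app name
    broker=_redis_url("REDIS_BROKER_DB", "0"),
    backend=_redis_url("REDIS_BACKEND_DB", "1"),
)
```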
## Setup

### 1. Start Redis
```shell
docker run -d -p 6379:6379 redis:7-alpine
```

### 2. Configure Redis in `.env`
```
REDIS_HOST=localhost
REDIS_PORT=6379
REDIS_BROKER_DB=0
REDIS_BACKEND_DB=1
```

### 3. Start Workers
```shell
# Single worker (all queues)
python worker.py

# Or start specific queue workers
celery -A celery_app worker --queues=parsing --concurrency=2
celery -A celery_app worker --queues=pipeline --concurrency=4
celery -A celery_app worker --queues=embedding --concurrency=2
```

### 4. Start Flower (Optional Monitoring)
```shell
celery -A celery_app flower --port=5555
```

The Flower UI is available at http://localhost:5555.
## Task Queues

| Queue | Tasks |
|---|---|
| `parsing` | Document parsing (Docling) |
| `pipeline` | Full RAG pipeline processing |
| `processing` | Text cleaning, chunking, deduplication |
| `embedding` | Vector embedding generation |
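Tasks can be pinned to these queues with Celery's `task_routes` setting. The sketch below uses hypothetical fully qualified task names derived from the `tasks/` modules listed on this page; the real names may differ:

```python
# Hypothetical routing table mapping the task modules under tasks/
# to the queues above.
task_routes = {
    "tasks.parser_tasks.parse_document": {"queue": "parsing"},
    "tasks.pipeline_tasks.process_document": {"queue": "pipeline"},
    "tasks.cleaner_tasks.clean_text": {"queue": "processing"},
    "tasks.chunker_tasks.chunk_text": {"queue": "processing"},
    "tasks.dedupe_tasks.deduplicate": {"queue": "processing"},
    "tasks.embedding_tasks.embed_chunks": {"queue": "embedding"},
}

# Applied on the Celery app, e.g.: app.conf.task_routes = task_routes
```

Routing decides where a submitted task lands; the `--queues` flag on each worker decides which of those queues it consumes.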
## Task Definitions (`tasks/`)

| File | Tasks |
|---|---|
| `pipeline_tasks.py` | `process_document` — full pipeline for a single document |
| `parser_tasks.py` | `parse_document` — extract content from file |
| `chunker_tasks.py` | `chunk_text` — split text into chunks |
| `embedding_tasks.py` | `embed_chunks` — generate embeddings |
| `lang_tasks.py` | `detect_language` — detect chunk language |
| `cleaner_tasks.py` | `clean_text` — clean raw text |
| `dedupe_tasks.py` | `deduplicate` — remove duplicates |
| `formatter_tasks.py` | `format_output` — format results |
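The shape of one of these tasks might look like the sketch below. The function body is a naive stand-in for illustration only; in the real module the function would be decorated with `@app.task(queue="processing")` on the Celery app, and the actual signature may differ:

```python
# Hypothetical sketch of chunk_text from tasks/chunker_tasks.py.
def chunk_text(text: str, max_tokens: int = 512) -> list[str]:
    # Naive whitespace chunker: group words into max_tokens-sized chunks.
    words = text.split()
    return [
        " ".join(words[i:i + max_tokens])
        for i in range(0, len(words), max_tokens)
    ]
```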
## Usage Example

```python
from tasks.pipeline_tasks import process_document

# Submit a task
result = process_document.delay("<p>Hello World</p>", source="test")

# Wait for result
output = result.get(timeout=30)
print(output)
```

## When to Use Celery
| Scenario | Recommendation |
|---|---|
| Small deployments (<1000 docs) | Not needed — use async API endpoints |
| Large batch ingestion | Recommended — distribute parsing/embedding |
| Multi-server deployment | Recommended — scale workers independently |
| Real-time responsiveness needed | Recommended — offload heavy processing |
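For the large-batch case, documents can be partitioned and fanned out to the workers. A minimal sketch follows; the `batches` helper is illustrative and not part of the codebase, and the fan-out shown in the comments assumes the `process_document` signature from the usage example above:

```python
def batches(items, size):
    # Partition a list of documents into fixed-size batches for submission.
    return [items[i:i + size] for i in range(0, len(items), size)]

# Each batch would then be dispatched to the pipeline queue, e.g.:
# from celery import group
# from tasks.pipeline_tasks import process_document
# for batch in batches(documents, 100):
#     group(process_document.s(doc, source="bulk") for doc in batch)()
```

Submitting in batches keeps the broker from being flooded with one giant burst and makes progress easier to track per batch in Flower.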