API Reference
Query Endpoints
RAG query and semantic search endpoints
Base path: /api/v1
POST /api/v1/query
Query the RAG system with a question. Returns an AI-generated answer based on relevant documents.
Request Body:
{
"question": "How do I create an invoice?",
"top_k": 5,
"initial_k": 20,
"include_timing": true,
"include_context": false,
"temperature": 0.7
}

| Field | Type | Default | Description |
|---|---|---|---|
| question | string | required | The question to ask |
| top_k | int | 5 | Number of documents to use as context (1-20) |
| initial_k | int | 20 | Documents to retrieve before reranking (1-100) |
| include_timing | bool | true | Include timing breakdown in response |
| include_context | bool | false | Include raw context text in response |
| temperature | float | null | LLM temperature override (0.0-2.0) |
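The field table above can be mirrored on the client side so that out-of-range values fail before a request is sent. The following is a minimal sketch, not part of the API itself: `build_query_payload` is a hypothetical helper name, and omitting `temperature` when it is unset assumes the server then applies its own default.

```python
from typing import Any, Optional


def build_query_payload(
    question: str,
    top_k: int = 5,
    initial_k: int = 20,
    include_timing: bool = True,
    include_context: bool = False,
    temperature: Optional[float] = None,
) -> dict[str, Any]:
    """Build a request body for POST /api/v1/query (hypothetical helper).

    Defaults and ranges are copied from the field table; the server
    remains the authority on validation.
    """
    if not question.strip():
        raise ValueError("question is required")
    if not 1 <= top_k <= 20:
        raise ValueError("top_k must be between 1 and 20")
    if not 1 <= initial_k <= 100:
        raise ValueError("initial_k must be between 1 and 100")

    payload: dict[str, Any] = {
        "question": question,
        "top_k": top_k,
        "initial_k": initial_k,
        "include_timing": include_timing,
        "include_context": include_context,
    }
    # Assumption: leaving the field out is equivalent to sending null.
    if temperature is not None:
        if not 0.0 <= temperature <= 2.0:
            raise ValueError("temperature must be between 0.0 and 2.0")
        payload["temperature"] = temperature
    return payload
```

Calling `build_query_payload("How do I create an invoice?", temperature=0.7)` produces a body equivalent to the request example above.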
Response:
{
"question": "How do I create an invoice?",
"answer": "To create an invoice, follow these steps...",
"sources": [
{
"title": "Invoice Guide",
"source": "https://docs.example.com/invoices",
"section": "documentation",
"score": 0.95,
"doc_id": "abc123"
}
],
"timing": {
"embedding": 0.05,
"vector_search": 0.12,
"reranking": 2.5,
"retrieval_total": 2.67,
"context_formatting": 0.001,
"llm_generation": 10.0,
"total": 12.67
},
"context_used": null
}

Timing Fields

| Field | Description |
|---|---|
| embedding | Time to generate query embedding (seconds) |
| vector_search | ChromaDB vector similarity search time (seconds) |
| reranking | Cross-encoder reranking time (seconds). 0 if disabled. |
| retrieval_total | Total retrieval time (embedding + search + reranking) (seconds) |
| context_formatting | Time to format documents for LLM (seconds) |
| llm_generation | LLM answer generation time (seconds) |
| total | Total end-to-end time (seconds) |
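The timing breakdown lends itself to a quick client-side sanity check. The sketch below uses the sample values from the response above; the helper names are hypothetical, and the 0.01 s tolerance is an assumption to absorb rounding in the reported figures.

```python
import math

# Individual pipeline stages reported in the "timing" object.
STAGES = (
    "embedding",
    "vector_search",
    "reranking",
    "context_formatting",
    "llm_generation",
)


def retrieval_is_consistent(timing: dict, tol: float = 0.01) -> bool:
    """Check retrieval_total == embedding + vector_search + reranking."""
    parts = timing["embedding"] + timing["vector_search"] + timing["reranking"]
    return math.isclose(parts, timing["retrieval_total"], abs_tol=tol)


def slowest_stage(timing: dict) -> str:
    """Return the name of the stage that took the longest."""
    return max(STAGES, key=lambda name: timing[name])


def stage_shares(timing: dict) -> dict:
    """Return each stage's fraction of the end-to-end total."""
    total = timing["total"]
    if total <= 0:
        return {}
    return {name: timing[name] / total for name in STAGES}


# Sample values copied from the response example.
timing = {
    "embedding": 0.05,
    "vector_search": 0.12,
    "reranking": 2.5,
    "retrieval_total": 2.67,
    "context_formatting": 0.001,
    "llm_generation": 10.0,
    "total": 12.67,
}
```

With the sample data, LLM generation dominates the request and reranking is the largest part of retrieval, which suggests where tuning `top_k` or `initial_k` might matter most.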
Example:
curl -X POST http://localhost:9000/api/v1/query \
-H "X-API-Key: your-key" \
-H "Content-Type: application/json" \
-d '{"question": "How do I create an invoice?"}'

POST /api/v1/search
Semantic search without LLM generation. Returns relevant documents ranked by similarity.
Request Body:
{
"query": "invoice creation process",
"top_k": 10,
"initial_k": 50
}

| Field | Type | Default | Description |
|---|---|---|---|
| query | string | required | Search query |
| top_k | int | 5 | Number of results to return (1-50) |
| initial_k | int | 20 | Results to retrieve before reranking (1-100) |
Response:
{
"results": [
{
"text": "Document content here...",
"source": "https://docs.example.com/invoices",
"metadata": {
"title": "Invoice Guide",
"section": "documentation",
"language": "en"
},
"score": 0.95,
"doc_id": "abc123"
}
],
"count": 10
}
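
For callers who prefer Python to curl, a search request might be assembled as follows. This is a sketch under assumptions: the host `http://localhost:9000` and the `X-API-Key` header are taken from the query curl example and are assumed to apply to `/search` as well; the function names are hypothetical; and `filter_results` assumes a higher `score` means a closer match.

```python
import json
import urllib.request

# Assumption: same host and base path as the curl example for /query.
BASE_URL = "http://localhost:9000/api/v1"


def build_search_request(
    query: str, api_key: str, top_k: int = 5, initial_k: int = 20
) -> urllib.request.Request:
    """Prepare (but do not send) a POST /api/v1/search request."""
    if not 1 <= top_k <= 50:
        raise ValueError("top_k must be between 1 and 50")
    if not 1 <= initial_k <= 100:
        raise ValueError("initial_k must be between 1 and 100")
    body = json.dumps(
        {"query": query, "top_k": top_k, "initial_k": initial_k}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/search",
        data=body,
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def send(req: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)


def filter_results(response: dict, min_score: float) -> list:
    """Keep only results whose score is at least min_score."""
    return [r for r in response["results"] if r["score"] >= min_score]
```

A typical call would be `filter_results(send(build_search_request("invoice creation process", "your-key", top_k=10, initial_k=50)), 0.5)`, which requires a running server.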