Retrieval-Augmented Generation

How a local AI answers questions from your own documents — no cloud, no API keys

Demo query: "What database stores the embeddings?"

Overview

RAG gives a language model access to your own documents at query time, with no retraining and no cloud service required. For each question, it fetches the most relevant passages from a local index and passes them to the model as context, so the answer is grounded in your actual source material rather than only in what the model learned during training.
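The retrieve-then-prompt idea can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline this page describes: the `embed` function is a toy bag-of-words stand-in for a real embedding model, the passages are made-up sample data, and the names `retrieve` and `build_prompt` are hypothetical.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy stand-in for a real embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(question, passages, k=2):
    """Return the k passages most similar to the question."""
    q = embed(question)
    ranked = sorted(passages, key=lambda p: cosine(q, embed(p)), reverse=True)
    return ranked[:k]

def build_prompt(question, passages, k=2):
    """Place the retrieved passages ahead of the question as context."""
    context = "\n".join(f"- {p}" for p in retrieve(question, passages, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

# Illustrative sample passages (assumed, not taken from a real document set).
PASSAGES = [
    "ChromaDB stores the embeddings in a local database.",
    "Ollama runs the language model on the local machine.",
    "The index is rebuilt when the documents change.",
]

if __name__ == "__main__":
    print(build_prompt("What database stores the embeddings?", PASSAGES, k=1))
```

In a real setup the prompt string would be sent to the local LLM; here it is only printed so the sketch stays dependency-free.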

Two phases

Build Index runs only when the documents change. Answer Queries runs on every question.
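One way to keep the two phases separate is to fingerprint the documents and skip the rebuild when nothing has changed. The sketch below assumes that approach; the page does not say how change detection works. The class name `TwoPhaseRag`, its methods, and the word-overlap scoring are all hypothetical, and `answer_query` returns the retrieved passages rather than calling an LLM.

```python
import hashlib

class TwoPhaseRag:
    """Hypothetical wrapper separating index building from query answering."""

    def __init__(self):
        self.fingerprint = None  # hash of the documents the index was built from
        self.index = []          # list of (token_set, passage) pairs
        self.builds = 0          # how many times phase 1 actually ran

    @staticmethod
    def _fingerprint(docs):
        h = hashlib.sha256()
        for d in docs:
            h.update(d.encode("utf-8"))
            h.update(b"\x00")  # separator so ["ab", "c"] differs from ["a", "bc"]
        return h.hexdigest()

    def build_index(self, docs):
        """Phase 1: rebuild only if the documents changed. Returns True on rebuild."""
        fp = self._fingerprint(docs)
        if fp == self.fingerprint:
            return False  # index is already up to date
        # Toy "embedding": the set of lowercase words in each passage.
        self.index = [(set(d.lower().split()), d) for d in docs]
        self.fingerprint = fp
        self.builds += 1
        return True

    def answer_query(self, question, k=1):
        """Phase 2: runs on every question; returns the top-k passages as context."""
        q = set(question.lower().split())
        ranked = sorted(self.index, key=lambda item: len(q & item[0]), reverse=True)
        return [passage for _, passage in ranked[:k]]
```

Calling `build_index` before every query is then cheap: it hashes the documents and returns immediately unless something changed.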

Fully local

No API keys, no data leaving the machine. Embedder, vector store, and LLM all run on-device via Ollama and ChromaDB.