RAG (Retrieval-Augmented Generation)

RAG retrieves relevant passages from your knowledge base and passes them to the LLM together with the question. The model's answer is then grounded in your own data rather than only in what it learned during training.

flowchart LR
  A[User question] --> B[Retriever]
  B --> C[Vector DB / Docs]
  C --> B
  B --> D[Top relevant chunks]
  D --> E[LLM]
  A --> E
  E --> F[Grounded answer]

Why teams use RAG

Typical RAG pipeline

  1. Ingest docs and split them into chunks.
  2. Create an embedding for each chunk and store it in a vector DB.
  3. At query time, embed the question and retrieve the top matching chunks.
  4. Compose a prompt from the question and the retrieved chunks.
  5. Generate the answer and optionally cite sources.
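
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not a production design: `chunk_text`, `embed`, `InMemoryIndex`, and `compose_prompt` are hypothetical names, the "embedding" is a plain bag-of-words count standing in for a real embedding model, and a Python list stands in for the vector DB. Step 5 (the LLM call) is left out because it depends on the provider you use.

```python
import math
import re
from collections import Counter


def chunk_text(text, max_words=40):
    """Step 1: split a document into chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text):
    """Step 2 (toy stand-in): word counts instead of a learned embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class InMemoryIndex:
    """Hypothetical stand-in for a vector DB: rows of (source, chunk, vector)."""

    def __init__(self):
        self.rows = []

    def add_document(self, source, text):
        # Steps 1 and 2: chunk the document, embed each chunk, store it.
        for chunk in chunk_text(text):
            self.rows.append((source, chunk, embed(chunk)))

    def retrieve(self, question, k=2):
        # Step 3: rank stored chunks by similarity to the question.
        q = embed(question)
        ranked = sorted(self.rows, key=lambda row: cosine(q, row[2]), reverse=True)
        return [(source, chunk) for source, chunk, _ in ranked[:k]]


def compose_prompt(question, chunks):
    """Step 4: put numbered, source-tagged chunks ahead of the question."""
    context = "\n".join(
        f"[{i + 1}] ({source}) {chunk}" for i, (source, chunk) in enumerate(chunks)
    )
    return (
        "Answer using only the context below and cite sources by number.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )


# Usage with made-up documents; the resulting prompt is what you would send to the LLM.
index = InMemoryIndex()
index.add_document("expenses.txt", "Submit expense reports within 30 days of purchase.")
index.add_document("leave.txt", "Employees accrue leave monthly.")
question = "When do I submit expense reports?"
prompt = compose_prompt(question, index.retrieve(question, k=1))
```

Numbering the chunks and tagging each with its source is what makes the optional citation in step 5 possible: the model can refer to `[1]`, and you can map that back to a file.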

Example

An internal HR bot answers "How many casual leaves do I get?" using your policy docs rather than public internet text.
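
A rough sketch of how such a bot might stay grounded: it builds a prompt only when a policy chunk actually matches the question, and declines otherwise. Everything here is an assumption for illustration. The file names, the policy text, and the leave figure are placeholders, `build_hr_prompt` is a hypothetical helper, and keyword overlap stands in for the embedding similarity a real system would use.

```python
import re

# Placeholder policy snippets: file names and figures are invented for this example.
POLICY_CHUNKS = [
    ("leave-policy.md", "Full-time employees receive 12 casual leaves per calendar year."),
    ("travel-policy.md", "Book work travel through the internal travel desk."),
]


def tokens(text):
    """Lowercased set of words and numbers in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def build_hr_prompt(question, chunks=POLICY_CHUNKS, min_overlap=2):
    """Return a grounded prompt, or None when no policy chunk matches well enough."""
    q = tokens(question)
    scored = [(len(q & tokens(text)), source, text) for source, text in chunks]
    best_score, source, text = max(scored)
    if best_score < min_overlap:
        return None  # nothing relevant in the policy docs, so decline instead of guessing
    return (
        "Answer from the policy excerpt only and name the source file.\n\n"
        f"Excerpt ({source}): {text}\n\n"
        f"Question: {question}\nAnswer:"
    )


in_scope = build_hr_prompt("How many casual leaves do I get?")
out_of_scope = build_hr_prompt("What is the weather today?")
```

Here `in_scope` contains the excerpt from `leave-policy.md`, while `out_of_scope` is `None`, so the bot can reply that the policy docs do not cover the question.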