<aside> 🦥

Sources

https://pureai.com/articles/2025/03/03/understanding-rag.aspx

https://learn.microsoft.com/en-us/azure/developer/ai/advanced-retrieval-augmented-generation

https://blogs.nvidia.com/blog/what-is-retrieval-augmented-generation/

</aside>

The Basics of RAG

<aside> 🦥

Retrieval-Augmented Generation (RAG) enhances AI models by connecting them to external knowledge sources, so answers can draw on information the model was not trained on.

</aside>

How RAG Works

<aside> 🦥

  1. Query processing: The user asks a question
  2. Retrieval: The system searches a knowledge base for relevant information
  3. Context integration: The retrieved information is added to the model's prompt
  4. Generation: The model writes a response using both the retrieved information and what it learned in training

</aside>
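
The four steps above can be sketched in a few lines of Python. This is a toy under stated assumptions, not a real pipeline: `KNOWLEDGE_BASE` is a made-up list of strings, retrieval is plain word overlap rather than embeddings, and `generate` is a placeholder that stands in for an actual model call.

```python
import re

# Hypothetical in-memory knowledge base; a real system would index documents.
KNOWLEDGE_BASE = [
    "RAG connects a language model to external knowledge sources.",
    "Vector databases store embeddings for similarity search.",
    "Agentic RAG lets an agent decide when and what to search.",
]

def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Step 2: rank documents by how many words they share with the query."""
    q = tokenize(query)
    ranked = sorted(docs, key=lambda d: len(q & tokenize(d)), reverse=True)
    return ranked[:k]

def build_prompt(query: str, passages: list[str]) -> str:
    """Step 3: put the retrieved passages in front of the question."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

def generate(prompt: str) -> str:
    """Step 4 placeholder: a real system would send the prompt to an LLM."""
    return f"[model response to a {len(prompt)}-character prompt]"

def answer(query: str) -> str:
    passages = retrieve(query, KNOWLEDGE_BASE)  # Step 2: retrieval
    prompt = build_prompt(query, passages)      # Step 3: context integration
    return generate(prompt)                     # Step 4: generation

query = "What do vector databases store?"       # Step 1: the user's question
print(build_prompt(query, retrieve(query, KNOWLEDGE_BASE)))
```

The point of the sketch is the shape: the model never sees the whole knowledge base, only the few passages that retrieval selected for this question.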

Why Use RAG?

<aside> 🦥

RAG vs. Traditional LLMs

Common Applications

🦥 The Secret Ingredient: Vector Databases

<aside> 🦥

How does RAG "find relevant info"? It splits documents into chunks, turns each chunk into an embedding (a list of numbers that captures meaning), and stores the embeddings in a vector database (e.g. Pinecone, Weaviate, Qdrant, Chroma, or Postgres with pgvector). At query time it embeds your question the same way and returns the chunks whose meaning is closest to it, not just the ones that share keywords.

</aside>
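
Here is a rough sketch of what "closest in meaning" looks like in code. Everything in it is an assumption for illustration: the three-number embeddings are made up by hand (a real embedding model produces hundreds or thousands of dimensions), `STORE` is a plain list standing in for a vector database, and cosine similarity is one common closeness measure, not the only one.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return the cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hand-made embeddings; the dimensions loosely mean (pets, finance, cooking).
STORE = [
    ("How to care for a kitten", [0.9, 0.0, 0.1]),
    ("Opening a savings account", [0.0, 0.95, 0.05]),
    ("Baking sourdough bread", [0.05, 0.0, 0.9]),
]

def search(query_vec: list[float], store, k: int = 1) -> list[str]:
    """Score every stored chunk against the query and return the top k texts."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in store]
    scored.sort(reverse=True)
    return [text for _, text in scored[:k]]

# Pretend embedding of "my cat won't eat": no keyword in common with the
# kitten chunk, but the vector points in nearly the same direction.
query_vec = [0.85, 0.05, 0.2]
print(search(query_vec, STORE))
```

A real vector database does the same comparison, but uses an index so it does not have to score every stored chunk on each query.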

🤖 Agentic RAG

<aside> 🦥

Basic RAG retrieves once, then answers. Agentic RAG lets the agent decide when to search, what to search for, whether to search again, and even which knowledge base to use. That makes it much better for complex, multi-part questions.

</aside>
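
A minimal sketch of that loop, with loud caveats: in a real agentic setup an LLM makes these decisions, while here simple rules stand in for it. `KNOWLEDGE_BASES`, `plan`, `choose_kb`, and `agentic_retrieve` are all hypothetical names invented for this example.

```python
import re

# Two hypothetical knowledge bases, keyed by name.
KNOWLEDGE_BASES = {
    "hr": ["Employees receive 25 vacation days per year."],
    "it": ["Password resets are handled through the self-service portal."],
}

def words(text: str) -> set[str]:
    """Lowercase the text and return its alphabetic words as a set."""
    return set(re.findall(r"[a-z]+", text.lower()))

def plan(question: str) -> list[str]:
    """Stand-in for deciding what to search for: split the question on 'and'."""
    parts = (p.strip(" ?") for p in re.split(r"\band\b", question))
    return [p for p in parts if p]

def choose_kb(sub_question: str):
    """Stand-in for picking a knowledge base: the one with the best word overlap."""
    q = words(sub_question)

    def score(name: str) -> int:
        return max(len(q & words(doc)) for doc in KNOWLEDGE_BASES[name])

    best = max(KNOWLEDGE_BASES, key=score)
    return best if score(best) > 0 else None  # None: nothing worth searching

def search(kb_name: str, sub_question: str) -> str:
    """Return the best-matching document from the chosen knowledge base."""
    q = words(sub_question)
    return max(KNOWLEDGE_BASES[kb_name], key=lambda d: len(q & words(d)))

def agentic_retrieve(question: str) -> list[tuple[str, str]]:
    """Search once per sub-question, each time in the knowledge base chosen for it."""
    findings = []
    for sub in plan(question):
        kb = choose_kb(sub)
        if kb is None:
            continue  # the agent decides not to search for this part
        findings.append((kb, search(kb, sub)))
    return findings

print(agentic_retrieve("How many vacation days do I get and how do password resets work?"))
```

Compared with basic RAG, the difference is the loop: one question turns into several searches, each routed to a different source, and the findings are combined before the model answers.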

RAG makes AI systems more reliable and useful by grounding their answers in specific, relevant information from the sources you choose.