Retrieval-augmented generation (RAG) is the first of the three core agentic AI patterns -- the discipline of feeding a model the specific knowledge it needs at inference time, rather than hoping that knowledge survived pretraining. RAG is cited as a required pattern alongside tool use and memory in roles like the RPM Interactive AI Product Engineer Contract. This article is the entry point: the loop itself, when to reach for it, when not to, and where to dive deeper inside this knowledge base.
A pure LLM call is closed-world: it can only answer from weights frozen at pretraining time. That is fine for "rephrase this paragraph" and useless for "what does our internal runbook say about a corrupted Postgres replica?". RAG turns the model into an open-world agent by giving it a retrieval step before generation -- the agent now has a way to consult sources the way a human engineer would consult a wiki, a codebase, or a spec.
Three things make RAG agentic rather than just "search + LLM":
The minimum viable RAG loop is four steps:
```python
# chunk, embed, format_chunks, vector_store, and llm are placeholders
# for your splitter, embedding model, vector database, and model client.

# 1. Index (offline, one-time per document)
chunks = chunk(document)
vectors = embed(chunks)
vector_store.upsert(zip(chunks, vectors))

# 2. Retrieve (per query)
query_vec = embed(user_query)
hits = vector_store.search(query_vec, k=5)

# 3. Augment
prompt = f"""Answer using ONLY the sources below.

Sources:
{format_chunks(hits)}

Question: {user_query}"""

# 4. Generate
answer = llm(prompt)
```
Every production RAG system is a variation on this loop. The interesting engineering lives in the parameters: how you chunk, which embedding model you pick, how many chunks k you retrieve, how you rerank, and how you compose the final prompt. Each of those is its own article in the RAG & Retrieval section below.
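To make those knobs concrete, here is a dependency-free toy version of the loop. The overlapping word-window chunker, the bag-of-words "embedding", and the in-memory store are deliberately crude stand-ins for a real text splitter, embedding model, and vector database, and the runbook text is invented for illustration:

```python
import math
import re
from collections import Counter

def chunk(text: str, size: int = 12, overlap: int = 3) -> list[str]:
    """Split text into overlapping word windows (a stand-in splitter)."""
    words = text.split()
    step = size - overlap
    return [" ".join(words[i:i + size])
            for i in range(0, max(len(words) - overlap, 1), step)]

def embed(text: str) -> Counter:
    """Bag-of-words term counts; a real system calls an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

class VectorStore:
    """In-memory store with a brute-force scan instead of an ANN index."""
    def __init__(self):
        self.items = []  # list of (chunk_text, vector) pairs
    def upsert(self, pairs):
        self.items.extend(pairs)
    def search(self, query_vec, k=5):
        ranked = sorted(self.items,
                        key=lambda it: cosine(query_vec, it[1]),
                        reverse=True)
        return [text for text, _ in ranked[:k]]

# 1. Index (offline, one-time per document)
document = ("If the Postgres replica reports corruption, stop the replica, "
            "take a base backup from the primary, and restart streaming "
            "replication. Unrelated section: how to rotate TLS certificates "
            "on the load balancer.")
store = VectorStore()
store.upsert([(c, embed(c)) for c in chunk(document)])

# 2. Retrieve (per query)
user_query = "what do we do about a corrupted Postgres replica?"
hits = store.search(embed(user_query), k=2)

# 3. Augment -- compose the grounded prompt; 4. Generate would hand it to the LLM
prompt = ("Answer using ONLY the sources below.\nSources:\n"
          + "\n".join(hits) + "\nQuestion: " + user_query)
```

Swapping the Counter vectors for real embeddings and the brute-force scan for an ANN index changes the quality and the cost, but not the four-step shape of the loop.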
Reach for RAG when:
Do not reach for RAG when:
A useful smell test: if the same answer should always come from the same source document, RAG is right. If the answer is "it depends on the user's question and a tool call", you want tool-use instead. If the answer should improve as the agent has more conversations, you want memory.
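The smell test can be written down as a tiny routing sketch. The boolean traits and the function name are hypothetical labels for the three questions above, not any real API:

```python
def pick_pattern(same_doc_same_answer: bool,
                 needs_live_tool_call: bool,
                 improves_with_conversations: bool) -> str:
    """Route a capability to one of the three agentic patterns."""
    if needs_live_tool_call:           # answer depends on a tool call
        return "tool use"
    if improves_with_conversations:    # answer should improve over time
        return "memory"
    if same_doc_same_answer:           # answer lives in a source document
        return "rag"
    return "plain llm"                 # none apply: a bare model call suffices
```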
This page is a map. Each link below is a deeper article in the knowledge base:
The "agentic AI patterns (RAG, tool use, memory)" trio shows up verbatim in product-engineer JDs. See the RPM Interactive AI Product Engineer Contract for an example role that lists this pattern as a hiring requirement -- pair this overview with the tool-use and memory introductions to cover the full bullet.