Glossary

52 terms: 19 recurring phrases from the agentic-sales guide (linked to the transcript chapters where they appear) plus 33 core LlamaIndex concepts (linked to the official docs).

A

agent · concept
In LlamaIndex, an agent is a piece of software that semi-autonomously performs tasks by combining an LLM, memory, and tools. A reasoning loop decides which tool to use next, and the agent handles inputs from outside users.
Memory hook: The agent is a reasoning loop that decides which tool to use next and runs it until the task is done.
LlamaIndex Docs →
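The reasoning loop in the agent entry above can be sketched in plain Python. This is not the LlamaIndex API: `fake_llm`, `TOOLS`, and `run_agent` are hypothetical names, and the scripted `fake_llm` stands in for a real model so the loop runs offline.

```python
# Plain-Python sketch of an agent loop; not the LlamaIndex API.
# fake_llm is a hypothetical, scripted stand-in for a real model.

def add(a: float, b: float) -> float:
    return a + b

def multiply(a: float, b: float) -> float:
    return a * b

TOOLS = {"add": add, "multiply": multiply}

def fake_llm(task: str, memory: list) -> dict:
    """Decide the next action from the task and what memory already holds."""
    if not memory:                       # nothing done yet: add first
        return {"tool": "add", "args": (2, 3)}
    if len(memory) == 1:                 # one result so far: multiply it
        return {"tool": "multiply", "args": (memory[0][1], 4)}
    return {"final": memory[-1][1]}      # enough information: stop

def run_agent(task: str, max_steps: int = 5):
    memory = []                          # (tool name, result) pairs
    for _ in range(max_steps):
        decision = fake_llm(task, memory)
        if "final" in decision:
            return decision["final"], memory
        result = TOOLS[decision["tool"]](*decision["args"])
        memory.append((decision["tool"], result))
    raise RuntimeError("agent did not finish within max_steps")

answer, trace = run_agent("compute (2 + 3) * 4")
```

The `max_steps` cap is a design choice of this sketch: it keeps a confused model from looping forever.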
agentic application · concept
An agentic application in LlamaIndex is any system that uses an LLM to make decisions, take actions, and/or interact with the world. It is typically augmented with tools, memory, and dynamic prompts, and may incorporate prompt chaining, routing, parallelism, orchestration, or reflection.
Memory hook: Agentic application is the orchestrator that uses LLMs and tools to direct actions.
LlamaIndex Docs →
AgentWorkflow · concept
In LlamaIndex, AgentWorkflow is a class that combines multiple agents into one system in which agents hand off control to one another to complete a task.
Memory hook: The AgentWorkflow is the handoff coordinator that passes control between agents to complete multi-agent tasks.
LlamaIndex Docs →

B

blast radius · phrase
…classification, and discovery. A jammed graph stays inside its own pool, so the blast radius equals the pool, never the platform. The observability plane builds…
Appears in: Ch 1, Ch 2, Ch 3, Ch 10

C

chat engine · concept
A chat engine is a high-level interface for having a conversation with your data across multiple back-and-forth exchanges. It is the stateful counterpart to a query engine: it keeps the conversation history so that responses can take previous context into account.
Memory hook: Chat engine is the stateful chat partner that remembers past exchanges so you can ask follow-up questions.
LlamaIndex Docs →
Context (workflow) · concept
In LlamaIndex workflows, a Context object is passed between steps to store and share state across them, eliminating the need for steps to pass every value explicitly through events.
Memory hook: Context is the shared sticky note that steps pass around, storing state so they don’t clutter events with every detail.
LlamaIndex Docs →
cross process · phrase
…shape. It needs structured run metadata holding the model output. It also needs cross process propagation, so the trace survives the hop into the Python backend…
Appears in: Ch 2

D

data connector (Reader) · concept
A data connector, often called a Reader, ingests data from different data sources and formats into Documents and Nodes.
Memory hook: The data connector is the translator that ingests data from any source, turning it into Documents and Nodes.
LlamaIndex Docs →
design choice · phrase
…nt, with the lower bound of the confidence interval clearing one half. The big design choice here is fidelity against load. Full shadow mode duplicates live tr…
Appears in: Ch 9, Ch 11
document · concept
In LlamaIndex, a Document is a general container for any data source, such as PDFs, API responses, or database queries. It preserves text content along with metadata and relationship attributes, and serves as the foundational input for indexing and retrieval-augmented generation workflows.
Memory hook: Document is the original container that carries raw data and metadata before being chunked into nodes.
LlamaIndex Docs →

E

embedding · concept
In LlamaIndex, an embedding is a numerical representation of data, typically computed for each node and stored in a vector store. At query time the system converts the query into an embedding and retrieves the nodes whose embeddings are numerically closest to it, which filters for relevance.
Memory hook: Embeddings are the numeric fingerprints of your data; the query gets one too, so similar data can be found by comparing them.
LlamaIndex Docs →
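To make "numerically similar" in the embedding entry concrete, here is a minimal sketch that ranks nodes by cosine similarity. The three-dimensional vectors and node names are made up for illustration; real embeddings come from an embedding model and have hundreds or thousands of dimensions, and cosine similarity is only one common choice of metric.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy embeddings, one per node.
node_embeddings = {
    "pricing node": [0.9, 0.1, 0.0],
    "onboarding node": [0.1, 0.9, 0.1],
    "security node": [0.0, 0.2, 0.9],
}
query_embedding = [0.8, 0.2, 0.1]  # hypothetical embedding of the user query

# Rank node names from most to least similar to the query.
ranked = sorted(
    node_embeddings,
    key=lambda name: cosine_similarity(query_embedding, node_embeddings[name]),
    reverse=True,
)
top_k = ranked[:2]  # keep only the closest nodes as context
```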
event · concept
In LlamaIndex, an event is a user-defined Pydantic object that triggers a workflow step, carries data between steps, and can be emitted to activate subsequent steps.
Memory hook: Event is the data-carrying trigger that activates the next step in a workflow.
LlamaIndex Docs →
exact match · phrase
…n scoring. Three families fit the shape. Deterministic checks cover format and exact match for almost no cost. Reference comparison measures drift from a known…
Appears in: Ch 5, Ch 12

F

failure mode · phrase
…e route. No service scaffolding, no per team wiring, no coordination call. One failure mode is worth naming. The registry can become a merge bottleneck as the…
Appears in: Ch 1, Ch 2, Ch 3, Ch 4, Ch 5, Ch 6, Ch 7, Ch 8, Ch 10, Ch 11, Ch 12
failure mode dominates · phrase
…constraint was reaching the model decision without adding request latency. One failure mode dominates. A load balancer that strips trace headers breaks the cro…
Appears in: Ch 2, Ch 6, Ch 12
first class · phrase
…gn keeps only three classes of attribute on each span, and drops the rest. The first class is request identity. It joins one run to other systems. The client r…
Appears in: Ch 4, Ch 7, Ch 8, Ch 10, Ch 11
FunctionAgent · concept
FunctionAgent is a type of agent in LlamaIndex that uses an LLM provider's function-calling (tool-calling) capabilities to execute tools.
Memory hook: FunctionAgent is the direct caller that activates tools using the LLM's own function-calling, not custom prompts.
LlamaIndex Docs →
FunctionTool · concept
FunctionTool in LlamaIndex wraps any user-defined function as a Tool, inferring the function schema automatically or letting you customize it.
Memory hook: FunctionTool is the automatic converter that turns any Python function into an agent tool by inferring its schema.
LlamaIndex Docs →
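Schema inference of the kind the FunctionTool entry describes can be illustrated with the standard `inspect` module. This is a sketch of the idea only, not how LlamaIndex implements it; `infer_schema` and `get_quote` are hypothetical names.

```python
import inspect

def infer_schema(fn):
    """Build a name/description/parameters record from a plain function."""
    parameters = {}
    for name, param in inspect.signature(fn).parameters.items():
        annotation = param.annotation
        if annotation is inspect.Parameter.empty:
            type_name = "any"            # no type hint given
        else:
            type_name = getattr(annotation, "__name__", str(annotation))
        parameters[name] = {
            "type": type_name,
            # A parameter without a default must be supplied by the caller.
            "required": param.default is inspect.Parameter.empty,
        }
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn) or "",
        "parameters": parameters,
    }

def get_quote(sku: str, quantity: int = 1) -> float:
    """Return a price quote for a SKU."""
    return 9.5 * quantity

schema = infer_schema(get_quote)
```

The name, docstring, and typed parameters are what a model would see when deciding whether and how to call the tool, which is why clear docstrings and type hints matter.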

I

index · concept
An index in LlamaIndex is a data structure built from documents that stores information in node objects and enables quick retrieval of relevant context for a user query. It is the foundation for retrieval-augmented generation (RAG) use cases, and query engines and chat engines are built on top of it.
Memory hook: The index is the filing cabinet that slices your documents into searchable chunks and tags each with a vector number.
LlamaIndex Docs →
ingestion pipeline · concept
An ingestion pipeline in LlamaIndex applies a series of transformations to input data. The resulting nodes are either returned or inserted into a vector database, and each node-transformation pair is cached to speed up subsequent runs.
Memory hook: The ingestion pipeline is a smart assembly line that caches each transformation to avoid re-processing the same data.
LlamaIndex Docs →
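The caching behaviour in the ingestion pipeline entry can be sketched as follows. This is a plain-Python illustration, not the LlamaIndex `IngestionPipeline`: `CachedPipeline` is a hypothetical class, nodes are plain strings, and the cache key (a hash of the input nodes plus the transformation name) is an assumption about one reasonable way to key such a cache.

```python
import hashlib

class CachedPipeline:
    """Runs transformations in order, caching each (nodes, transformation) pair."""

    def __init__(self, transformations):
        self.transformations = transformations
        self.cache = {}
        self.calls = 0  # counts real (uncached) transformation runs

    def _key(self, nodes, transformation):
        payload = transformation.__name__ + "|" + "\x1f".join(nodes)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def run(self, nodes):
        for transformation in self.transformations:
            key = self._key(nodes, transformation)
            if key not in self.cache:        # cache miss: do the work once
                self.cache[key] = transformation(nodes)
                self.calls += 1
            nodes = self.cache[key]          # cache hit or freshly stored
        return nodes

def split_sentences(nodes):
    return [s.strip() for text in nodes for s in text.split(".") if s.strip()]

def lowercase(nodes):
    return [text.lower() for text in nodes]

pipeline = CachedPipeline([split_sentences, lowercase])
first = pipeline.run(["Shadow mode comes first. Canary follows."])
second = pipeline.run(["Shadow mode comes first. Canary follows."])  # served from cache
```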

M

memory · concept
In LlamaIndex, memory is a core component of agents that stores chat history and is managed by default using the `ChatMemoryBuffer` class, which can be customized separately and passed to an agent.
Memory hook: Memory is the agent's chat logbook, storing every message and tool result to guide its next decisions.
LlamaIndex Docs →
model call · phrase
…he deeper tool. The second class is call shape, which lets you replay the exact model call. The load bearing field here is the prompt version. Without it, you r…
Appears in: Ch 4, Ch 10
model decision · phrase
…ability misses this kind of semantic failure. The design fix is to capture the model decision on every call. Then it links that decision to the transport trace…
Appears in: Ch 2

N

new version · phrase
…t is resampled many times, and a confidence interval is read off the results. A new version ships only when the lower bound clears one half. The team also segme…
Appears in: Ch 6, Ch 9
node · concept
A Node is the atomic unit of data in LlamaIndex, representing a discrete “chunk” from a source Document (such as a text segment or image) that inherits the Document’s metadata and includes its own metadata and relationship information linking to other nodes.
Memory hook: Node, the atomic data chunk, inherits its document's metadata and links to sibling nodes.
LlamaIndex Docs →
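The metadata inheritance and sibling links in the node entry can be pictured with a small dataclass. This `Node` is a hypothetical stand-in, not the LlamaIndex class, and the `prev_id`/`next_id` integers are a simplification of real relationship information.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Node:
    """Hypothetical stand-in for a chunk of a source document."""
    text: str
    metadata: dict = field(default_factory=dict)
    prev_id: Optional[int] = None   # index of the previous sibling, if any
    next_id: Optional[int] = None   # index of the next sibling, if any

def chunks_to_nodes(chunks, document_metadata):
    nodes = []
    for i, chunk in enumerate(chunks):
        nodes.append(Node(
            text=chunk,
            # Inherit the document's metadata, then add node-specific metadata.
            metadata={**document_metadata, "chunk_index": i},
            prev_id=i - 1 if i > 0 else None,
            next_id=i + 1 if i < len(chunks) - 1 else None,
        ))
    return nodes

nodes = chunks_to_nodes(
    ["Intro.", "Pricing.", "Support."],
    {"file_name": "guide.pdf"},
)
```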
node parser · concept
A node parser is a transformation that takes a list of Documents and chunks them into Node objects. A sentence-based parser such as SentenceSplitter, for example, splits text on sentence boundaries while respecting a configured chunk size and overlap.
Memory hook: The node parser is the chunker that splits documents into node-sized pieces.
LlamaIndex Docs →
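Chunking on sentence boundaries with a size limit and overlap, as the node parser entry describes, might look like the sketch below. It is a simplified illustration with assumptions stated up front: `split_into_chunks` is a hypothetical function, size is measured in characters rather than tokens, and overlap is measured in whole sentences.

```python
import re

def split_into_chunks(text, chunk_size=70, overlap_sentences=1):
    """Greedily pack whole sentences into chunks of at most chunk_size characters.

    A chunk may exceed chunk_size when the carried-over overlap plus the next
    sentence is too long; a production splitter would handle that case.
    """
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks, current = [], []
    for sentence in sentences:
        candidate = " ".join(current + [sentence])
        if current and len(candidate) > chunk_size:
            chunks.append(" ".join(current))
            # Carry the last sentence(s) into the next chunk as overlap.
            current = current[-overlap_sentences:] if overlap_sentences else []
        current.append(sentence)
    if current:
        chunks.append(" ".join(current))
    return chunks

text = (
    "Shadow mode comes first. The new version runs on real input. "
    "Its output is discarded. Canary comes second."
)
chunks = split_into_chunks(text, chunk_size=70, overlap_sentences=1)
```

With these inputs each chunk shares one sentence with its neighbour, so context that straddles a boundary is still retrievable from either side.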
node postprocessor · concept
A node postprocessor takes a set of retrieved nodes and applies transformations, filtering, or re-ranking logic to them. It runs after the retriever returns nodes and before the response synthesizer generates a response.
Memory hook: The node postprocessor is the quality-control inspector that re-ranks or filters retrieved nodes before the response synthesizer gets them.
LlamaIndex Docs →
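The filter-then-rerank step in the node postprocessor entry can be sketched with two small functions. Both are hypothetical, and the keyword-based rerank is a deliberately crude stand-in for a real reranking model; retrieved nodes are represented as `(text, score)` tuples.

```python
def similarity_cutoff(scored_nodes, cutoff):
    """Drop nodes whose retrieval score is below the cutoff."""
    return [(text, score) for text, score in scored_nodes if score >= cutoff]

def rerank_by_keyword(scored_nodes, keyword):
    """Move nodes that mention the keyword to the front.

    sorted() is stable, so the original order is kept within each group.
    """
    return sorted(
        scored_nodes,
        key=lambda item: keyword.lower() not in item[0].lower(),
    )

retrieved = [
    ("Onboarding takes two weeks.", 0.81),
    ("Pricing starts at 40 per seat.", 0.78),
    ("Office dog policy.", 0.32),
]

# Filter first, then rerank what is left, before any response is generated.
processed = rerank_by_keyword(similarity_cutoff(retrieved, cutoff=0.5), "pricing")
```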

O

One failure mode · phrase
…d one route. No service scaffolding, no per team wiring, no coordination call. One failure mode is worth naming. The registry can become a merge bottleneck as…
Appears in: Ch 1, Ch 2, Ch 4, Ch 6, Ch 7, Ch 8, Ch 10, Ch 12

P

peer trace · phrase
…end to end. After the call returns, the client reads the run identifier and the peer trace id. It attaches them as span attributes. An operator can then pivot f…
Appears in: Ch 2, Ch 4
prompt version · phrase
…which lets you replay the exact model call. The load bearing field here is the prompt version. Without it, you recorded the call but cannot reproduce it, becau…
Appears in: Ch 4, Ch 11

Q

query engine · concept
A query engine is a generic interface that takes a natural language query and returns a rich response, most often built on one or more indexes via retrievers.
Memory hook: Query engine is the single-query gatekeeper: it takes a question, retrieves context, and returns a rich answer.
LlamaIndex Docs →
QueryEngineTool · concept
A QueryEngineTool is a type of Tool that wraps an existing query engine, and because agent abstractions inherit from BaseQueryEngine, these tools can also wrap other agents.
Memory hook: QueryEngineTool wraps a query engine into a tool so agents can query your data.
LlamaIndex Docs →

R

RAG · concept
RAG (retrieval-augmented generation) in LlamaIndex is a core technique for building data-backed LLM applications. It addresses the fact that LLMs are not trained on your data: you load, index, and store your data, then at query time filter it down to the most relevant context and send that context along with the user query to an LLM to generate a response.
Memory hook: RAG is your data's librarian, indexing it to retrieve only the context relevant to each query.
LlamaIndex Docs →
response mode · concept
A response mode is a setting passed as a kwarg to a response synthesizer. It determines the strategy used to generate a response from an LLM given a user query and a set of text chunks. Options include `refine`, `compact`, `tree_summarize`, `simple_summarize`, `no_text`, and `accumulate`.
Memory hook: The response mode is the assembly line that defines how text chunks are processed into the final answer.
LlamaIndex Docs →
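To show why the choice of response mode matters, the sketch below contrasts a refine-style strategy (one model call per chunk) with a compact-style strategy (pack chunks together first, then refine). It is an illustration of the two ideas, not the LlamaIndex implementation: `CountingLLM` is a hypothetical stub that only counts calls, and the 80-character packing limit is an arbitrary stand-in for a context window.

```python
class CountingLLM:
    """Hypothetical stand-in for a model; it only counts calls."""

    def __init__(self):
        self.calls = 0

    def complete(self, prompt: str) -> str:
        self.calls += 1
        return f"draft {self.calls}"

def refine(llm, query, chunks):
    """Answer from the first chunk, then revise the answer once per later chunk."""
    answer = None
    for chunk in chunks:
        if answer is None:
            prompt = f"Context: {chunk}\nQuestion: {query}"
        else:
            prompt = f"Current answer: {answer}\nNew context: {chunk}\nQuestion: {query}"
        answer = llm.complete(prompt)
    return answer

def compact(llm, query, chunks, max_chars=80):
    """Concatenate chunks up to max_chars each, then refine over the packed chunks."""
    packed, current = [], ""
    for chunk in chunks:
        if current and len(current) + 1 + len(chunk) > max_chars:
            packed.append(current)
            current = chunk
        else:
            current = f"{current} {chunk}".strip()
    if current:
        packed.append(current)
    return refine(llm, query, packed)

chunks = ["a" * 30, "b" * 30, "c" * 30, "d" * 30]
refine_llm, compact_llm = CountingLLM(), CountingLLM()
refine_answer = refine(refine_llm, "What changed?", chunks)
compact_answer = compact(compact_llm, "What changed?", chunks)
```

On the same four chunks the compact-style run makes fewer model calls, which is the usual reason to prefer it when chunks are small.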
response synthesizer · concept
A response synthesizer generates a response from an LLM by using a user query and a given set of text chunks, and it is used after nodes are retrieved from a retriever and after any node postprocessors are run.
Memory hook: The response synthesizer is the assembly line that takes retrieved text chunks and outputs your final answer.
LlamaIndex Docs →
retriever · concept
A retriever is a component responsible for fetching the most relevant context from an index given a user query or chat message, serving as a key building block in query engines and chat engines.
Memory hook: The retriever is the librarian that fetches the most relevant nodes from the index based on your query.
LlamaIndex Docs →
router · concept
A router determines which retriever will be used to retrieve relevant context, using a selector based on each candidate's metadata and the query.
Memory hook: The router is a dispatcher that routes each query to the best retriever by checking metadata.
LlamaIndex Docs →
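The selection step in the router entry can be sketched with a toy selector. Everything here is hypothetical: `select_retriever` picks the candidate whose description shares the most words with the query, whereas a real selector would typically ask an LLM to choose, and the two lambda "retrievers" return canned text.

```python
def select_retriever(query, candidates):
    """Return the name of the candidate whose description best overlaps the query."""
    query_words = set(query.lower().split())

    def overlap(name):
        description_words = set(candidates[name]["description"].lower().split())
        return len(query_words & description_words)

    return max(candidates, key=overlap)

# Each candidate carries metadata (a description) and a retrieve function.
candidates = {
    "pricing": {
        "description": "pricing plans discounts and seat cost",
        "retrieve": lambda query: ["Pricing starts at 40 per seat."],
    },
    "security": {
        "description": "security compliance audits and encryption",
        "retrieve": lambda query: ["Data is encrypted at rest."],
    },
}

query = "what discounts apply to annual pricing"
chosen = select_retriever(query, candidates)
context = candidates[chosen]["retrieve"](query)
```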
run tree · phrase
…uals the pool, never the platform. The observability plane builds a distributed run tree. That tree crosses the hop from TypeScript to Python, turning one user…
Appears in: Ch 1, Ch 3, Ch 11

S

Shadow mode · phrase
…traffic in slow steps. Both lean on the same routing layer in the dispatcher. Shadow mode comes first. The new version runs on real production input, but its…
Appears in: Ch 9
StartEvent / StopEvent · concept
In LlamaIndex, the StartEvent is a special framework-provided event that marks the entry point of a workflow and holds arbitrary attributes passed via the `.run()` method. The StopEvent designates final steps: when a step returns it, the workflow terminates immediately and returns the value passed in its result parameter.
Memory hook: StartEvent is the green light that launches the run; StopEvent is the red light that halts and hands back the result.
LlamaIndex Docs →
step · concept
In LlamaIndex, a step is a method decorated with the `@step` decorator that forms a unit of execution within a Workflow, triggered by an Event and capable of emitting further Events to activate subsequent steps.
Memory hook: A step is an event-triggered worker that receives one event and emits another to keep the workflow moving.
LlamaIndex Docs →
SummaryIndex · concept
The SummaryIndex is a simple index that stores nodes as a sequential list and, by default, returns all nodes for a query, making it useful for summarization over a whole document.
Memory hook: The SummaryIndex is the firehose index that returns all nodes for full-document summarization.
LlamaIndex Docs →

T

Three designs competed · phrase
…scade as a single unit. The run tree is how this platform makes that possible. Three designs competed. Flat logs sharing one correlation identifier are simple…
Appears in: Ch 3, Ch 7, Ch 8
tool · concept
A tool in LlamaIndex is a generic interface that implements a `__call__` method and returns metadata such as name, description, and function schema, serving as the core abstraction for building agentic systems by defining actions that agents can select and execute.
Memory hook: A tool is an agent's API endpoint: a named, described function it selects and calls to act.
LlamaIndex Docs →
ToolSpec · concept
A ToolSpec is a community-contributed specification that defines one or more tools around a single service, such as Gmail, and implements a `to_tool_list` method to provide pre-defined tool collections for common APIs.
Memory hook: ToolSpec bundles community-contributed tools for a single service into one ready-to-use collection.
LlamaIndex Docs →
trace identifier · phrase
…response carries two identifiers back. One is the run identifier and one is the trace identifier. The client records both on the active span. Now an operator ca…
Appears in: Ch 3, Ch 4
trade off · phrase
…TypeScript to Python, turning one user action into one debuggable thing. The trade off decided the shape. A monolith with branching prompts has the smallest…
Appears in: Ch 1, Ch 2, Ch 3, Ch 4, Ch 5, Ch 6, Ch 7, Ch 8, Ch 10, Ch 11, Ch 12
transformation · concept
In LlamaIndex, a transformation takes a list of nodes as input and returns a list of nodes. Transformations are typically chained in an IngestionPipeline; common examples include node parsers, text splitters, metadata extractors, and embedding models. The pipeline caches each node-transformation pair to speed up subsequent runs.
LlamaIndex Docs →

V

vector store · concept
A vector store in LlamaIndex is a specialized database that stores vector embeddings, and at query time it finds data numerically similar to the embedding of the query. It serves as the storage backend for indexes like VectorStoreIndex, which compute and store embeddings for each node.
Memory hook: The vector store is the specialized database that stores embeddings and retrieves numerically similar context for queries.
LlamaIndex Docs →
VectorStoreIndex · concept
A VectorStoreIndex is a type of index that splits documents into nodes, computes a vector embedding for each node, and stores them so that, at query time, the nodes most similar to the query embedding can be retrieved. It is the most widely used index type and a good starting point for most implementations.
Memory hook: VectorStoreIndex is the librarian that retrieves the most relevant chunks by comparing their vector embeddings.
LlamaIndex Docs →

W

workflow · concept
In LlamaIndex, a workflow is an event-driven, step-based abstraction that controls an application's execution flow. The flow is divided into steps triggered by events, and each step emits further events to activate subsequent steps, which enables arbitrarily complex flows such as loops and branches.
Memory hook: Workflow is the event-driven conductor that orchestrates steps and LLM calls in a looping sequence.
LlamaIndex Docs →
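The event-driven loop in the workflow entry, together with the step, event, Context, and StartEvent / StopEvent entries, can be sketched in a few lines of plain Python. This is not the LlamaIndex `Workflow` class: `MiniWorkflow` is hypothetical, steps are registered by explicit event type rather than inferred from type hints, the context is a plain dict, and execution is synchronous.

```python
class Event:
    """Carries arbitrary data between steps as attributes."""
    def __init__(self, **data):
        self.__dict__.update(data)

class StartEvent(Event): pass
class StopEvent(Event): pass
class DraftEvent(Event): pass   # a user-defined event

class MiniWorkflow:
    def __init__(self):
        self.handlers = {}

    def step(self, event_type):
        """Register a function as the step triggered by event_type."""
        def register(fn):
            self.handlers[event_type] = fn
            return fn
        return register

    def run(self, **kwargs):
        context = {}                      # shared state, visible to every step
        event = StartEvent(**kwargs)      # run() arguments become attributes
        while not isinstance(event, StopEvent):
            event = self.handlers[type(event)](context, event)
        return event.result               # StopEvent ends the run

workflow = MiniWorkflow()

@workflow.step(StartEvent)
def draft(context, event):
    context["topic"] = event.topic
    context["attempts"] = context.get("attempts", 0) + 1
    return DraftEvent(text=f"{event.topic} draft {context['attempts']}")

@workflow.step(DraftEvent)
def review(context, event):
    if context["attempts"] < 2:
        return StartEvent(topic=context["topic"])   # loop back for another draft
    return StopEvent(result=event.text)

result = workflow.run(topic="pricing")
```

Because `review` can emit either a `StartEvent` or a `StopEvent`, the same two steps express both a loop and a branch without any explicit control-flow graph.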
worth naming · phrase
…ice scaffolding, no per team wiring, no coordination call. One failure mode is worth naming. The registry can become a merge bottleneck as the team count climb…
Appears in: Ch 1, Ch 3, Ch 7