AG-UI Protocol Explained: The New Standard for Human-Agent Interfaces
As AI agents become capable enough to handle real workflows, a new infrastructure challenge has emerged: how do humans and agents communicate during execution—not just at the start and end, but throughout?
The AG-UI protocol is an open specification designed to solve exactly this. It defines a standard for how AI agents stream events, request input, and surface state to frontend applications and human operators in real time. If MCP (Model Context Protocol) standardized how agents access tools, AG-UI standardizes how agents talk to users.
This guide explains what AG-UI is, why it matters, how it works, and how to start using it in your agent stack.
The Problem AG-UI Solves
Before AG-UI, every team building a human-facing AI agent application had to invent its own communication protocol. How does the agent tell the frontend it's thinking? How does it request a human decision? How does the user send a correction mid-task? How is progress displayed?
The answers were different for every team—often ad hoc, poorly documented, and hard to reuse. This created a fragmented ecosystem where:
- Agent frameworks couldn't share frontend components
- Developers had to rebuild streaming UI infrastructure from scratch for every project
- Users got inconsistent experiences across agent-powered products
- Debugging agent behavior required custom logging in every implementation
AG-UI establishes a shared vocabulary and event structure so that any agent framework can produce events that any AG-UI-compatible frontend can render—without custom integration code.
What Is AG-UI?
AG-UI is an open, streaming event protocol that defines the format and semantics of messages exchanged between AI agents and user-facing interfaces.
It is:
- Transport-agnostic: works over HTTP (Server-Sent Events), WebSockets, or any streaming transport
- Framework-agnostic: can be implemented in any language or agent framework
- Bidirectional: agents send events to the frontend; users send messages and interrupts to the agent
- Stateful: the protocol includes state snapshots so frontends can reconstruct the full agent context at any point
It is not:
- A tool protocol (that's MCP's job)
- An agent framework itself
- A UI component library (though reference implementations exist)
AG-UI vs. MCP: Understanding the Distinction
A common source of confusion is how AG-UI relates to Anthropic's Model Context Protocol (MCP).
| Dimension | MCP | AG-UI |
|---|---|---|
| Purpose | Agent ↔ Tool communication | Agent ↔ Human/Frontend communication |
| Direction | Agent calls tools, receives results | Agent streams events; human sends messages |
| Audience | Tool/server developers | Frontend and agent framework developers |
| Focus | What capabilities the agent can use | How the agent communicates its state and progress |
| Relationship | Handles the "tool" side of the agent | Handles the "user interface" side |
They're complementary. An agent running in production typically uses MCP to access tools (web search, image generation, code execution) and AG-UI to communicate its progress and request human input.
Core Concepts in AG-UI
Event Types
AG-UI defines a standard set of event types that agents emit:
Lifecycle events:
- `RUN_STARTED` / `RUN_FINISHED` — the agent has begun or completed execution
- `STEP_STARTED` / `STEP_FINISHED` — a discrete step within the workflow has started or ended
- `RUN_ERROR` — the agent encountered an unrecoverable error
Message events:
- `TEXT_MESSAGE_START` / `TEXT_MESSAGE_CONTENT` / `TEXT_MESSAGE_END` — streaming text output from the agent
- `TOOL_CALL_START` / `TOOL_CALL_ARGS` / `TOOL_CALL_END` — the agent is invoking a tool
State events:
- `STATE_SNAPSHOT` — a full snapshot of the current agent state
- `STATE_DELTA` — an incremental update to the state
- `MESSAGES_SNAPSHOT` — the full conversation history at a given point
Custom events:
- `CUSTOM` — for application-specific events not covered by the standard set
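Concretely, an event stream is just an ordered sequence of typed payloads. The sketch below models a run's lifecycle envelope using plain Python dicts; the field names (`threadId`, `runId`, `stepName`) follow the conventions above but are illustrative, not the official SDK types:

```python
def lifecycle_events(thread_id: str, run_id: str):
    """Yield the lifecycle envelope every AG-UI run shares (illustrative sketch)."""
    yield {"type": "RUN_STARTED", "threadId": thread_id, "runId": run_id}
    yield {"type": "STEP_STARTED", "stepName": "plan"}    # one discrete workflow step
    yield {"type": "STEP_FINISHED", "stepName": "plan"}
    yield {"type": "RUN_FINISHED", "threadId": thread_id, "runId": run_id}

events = list(lifecycle_events("thread-1", "run-1"))
print([e["type"] for e in events])
# ['RUN_STARTED', 'STEP_STARTED', 'STEP_FINISHED', 'RUN_FINISHED']
```

Message and state events slot into this same envelope between `RUN_STARTED` and `RUN_FINISHED`.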
The Human Turn
AG-UI also standardizes how humans interact with running agents. The frontend sends an `AgentInput` to interrupt, redirect, or provide information to the agent mid-execution. This is distinct from a new conversation turn—the agent is running, and the human is influencing its current task.
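To make the shape of such an input concrete, here is a minimal sketch of a mid-run human message as a JSON payload. The field names (`threadId`, `runId`, `messages`) are assumptions chosen for illustration, not the normative spec:

```python
import json

def make_agent_input(thread_id: str, run_id: str, text: str) -> str:
    """Build an illustrative mid-run input payload targeting an in-flight run."""
    payload = {
        "threadId": thread_id,  # which conversation context
        "runId": run_id,        # which in-flight run to influence
        "messages": [{"role": "user", "content": text}],
    }
    return json.dumps(payload)

msg = make_agent_input("thread-1", "run-1", "Focus on open-source frameworks only")
print(json.loads(msg)["messages"][0]["content"])
# Focus on open-source frameworks only
```

The key point is the `runId`: the message addresses a run that is already executing, not a fresh turn.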
Thread-Based Architecture
AG-UI organizes agent runs into threads—persistent conversation contexts that maintain state across multiple runs. A thread in AG-UI is roughly equivalent to a session or conversation in other frameworks, but with explicit protocol support for resuming, branching, and replaying.
How AG-UI Works: A Typical Flow
1. User submits a task via the frontend
2. Frontend sends a run request carrying `RunAgentInput` to the agent backend
3. Agent begins execution and emits a `RUN_STARTED` event
4. Agent emits `STEP_STARTED` for each planning step
5. Agent calls a tool → emits `TOOL_CALL_START`, `TOOL_CALL_ARGS`, `TOOL_CALL_END`
6. Agent generates text → emits `TEXT_MESSAGE_START`, `TEXT_MESSAGE_CONTENT` (streaming), `TEXT_MESSAGE_END`
7. Agent emits `STATE_DELTA` to update frontend state in real time
8. User decides to redirect the agent → sends `AgentInput` with a correction
9. Agent incorporates the correction and continues
10. Agent emits `RUN_FINISHED`
The frontend receives these events as a stream and renders them progressively—showing tool calls as they happen, streaming text in real time, and updating a progress indicator based on step events.
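The frontend side of this flow can be sketched as a simple reducer that folds events into displayable UI state. The dict-based events and their field names (`delta`, `stepName`, `toolCallName`) are illustrative:

```python
def render_stream(events):
    """Fold an AG-UI event stream into displayable UI state (sketch)."""
    state = {"text": "", "steps": [], "tool_calls": [], "done": False}
    for ev in events:
        t = ev["type"]
        if t == "TEXT_MESSAGE_CONTENT":
            state["text"] += ev["delta"]           # stream text as it arrives
        elif t == "STEP_STARTED":
            state["steps"].append(ev["stepName"])  # drive a progress indicator
        elif t == "TOOL_CALL_START":
            state["tool_calls"].append(ev["toolCallName"])  # show the tool call
        elif t == "RUN_FINISHED":
            state["done"] = True
    return state

ui = render_stream([
    {"type": "RUN_STARTED"},
    {"type": "STEP_STARTED", "stepName": "research"},
    {"type": "TOOL_CALL_START", "toolCallName": "web_search"},
    {"type": "TEXT_MESSAGE_CONTENT", "delta": "Here are "},
    {"type": "TEXT_MESSAGE_CONTENT", "delta": "the results."},
    {"type": "RUN_FINISHED"},
])
print(ui["text"], ui["done"])
# Here are the results. True
```

A real frontend would re-render on every event rather than at the end, but the reduction logic is the same.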
Implementing AG-UI
Framework Support
AG-UI is gaining support across major agent frameworks:
- LangGraph: AG-UI events can be emitted from graph nodes using the AG-UI Python SDK
- AG-UI CopilotKit integration: CopilotKit (a React frontend framework for AI) ships native AG-UI support
- Custom implementations: the AG-UI spec is open; any framework can implement it with the event type definitions
Quick Start (Python)
```python
from ag_ui.core import (
    RunAgentInput, EventType,
    RunStartedEvent, TextMessageStartEvent,
    TextMessageContentEvent, TextMessageEndEvent,
    RunFinishedEvent,
)
import uuid

async def run_agent(input: RunAgentInput):
    run_id = str(uuid.uuid4())

    # Signal that the run has begun
    yield RunStartedEvent(
        type=EventType.RUN_STARTED,
        thread_id=input.thread_id,
        run_id=run_id,
    )

    # Stream an assistant message in three phases: start, content chunks, end
    msg_id = str(uuid.uuid4())
    yield TextMessageStartEvent(
        type=EventType.TEXT_MESSAGE_START,
        message_id=msg_id,
        role="assistant",
    )
    for chunk in agent.stream(input.messages):  # `agent` is your underlying agent runtime
        yield TextMessageContentEvent(
            type=EventType.TEXT_MESSAGE_CONTENT,
            message_id=msg_id,
            delta=chunk,
        )
    yield TextMessageEndEvent(type=EventType.TEXT_MESSAGE_END, message_id=msg_id)

    # Signal completion so the frontend can finalize the run
    yield RunFinishedEvent(type=EventType.RUN_FINISHED, thread_id=input.thread_id, run_id=run_id)
```
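Because AG-UI is transport-agnostic, the event stream still needs a wire framing; over HTTP this is typically Server-Sent Events. A minimal framing sketch, assuming each event serializes to a JSON-compatible dict:

```python
import json

def to_sse(event: dict) -> str:
    """Frame one event as a Server-Sent Events message: a `data:` line
    carrying a JSON body, terminated by a blank line."""
    return f"data: {json.dumps(event)}\n\n"

frame = to_sse({"type": "RUN_STARTED", "runId": "run-1"})
print(repr(frame))
# 'data: {"type": "RUN_STARTED", "runId": "run-1"}\n\n'
```

On the server you would iterate the `run_agent` generator and write each frame to the response; browsers consume the stream natively via `EventSource`.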
Connecting to AnyCap
When your AG-UI-powered agent needs real-world capabilities—web search, image generation, file storage—AnyCap integrates as a tool layer beneath the orchestration. The agent calls AnyCap tools during its execution loop and emits the corresponding `TOOL_CALL_*` events so the frontend shows what's happening:
```
User: "Research the top 5 AI frameworks and create a summary image"

Agent emits: TOOL_CALL_START (tool: "anycap_search", args: {...})
Agent emits: TOOL_CALL_END   (result: search results)
Agent emits: TOOL_CALL_START (tool: "anycap_image_generate", args: {...})
Agent emits: TOOL_CALL_END   (result: image URL)
Agent emits: TEXT_MESSAGE_*  (streaming summary with embedded image)
```
This full transparency—surfaced through AG-UI events—is what separates a trustworthy human-agent interface from a black box.
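One way to guarantee that every tool invocation is surfaced is to wrap the call so the bracketing events are always emitted. The sketch below uses illustrative dict events (field names like `toolCallId` are assumptions), with a stand-in function in place of a real AnyCap tool:

```python
import json
import uuid

def call_tool_with_events(name, args, tool_fn):
    """Run a tool and yield the bracketing TOOL_CALL_* events around it (sketch)."""
    call_id = str(uuid.uuid4())
    yield {"type": "TOOL_CALL_START", "toolCallId": call_id, "toolCallName": name}
    yield {"type": "TOOL_CALL_ARGS", "toolCallId": call_id, "delta": json.dumps(args)}
    result = tool_fn(**args)  # the actual capability call
    yield {"type": "TOOL_CALL_END", "toolCallId": call_id, "result": result}

def fake_search(query):  # stand-in for a real AnyCap search tool
    return f"results for {query!r}"

events = list(call_tool_with_events("anycap_search", {"query": "AI frameworks"}, fake_search))
print([e["type"] for e in events])
# ['TOOL_CALL_START', 'TOOL_CALL_ARGS', 'TOOL_CALL_END']
```

Because the events are produced by the wrapper rather than by each tool, no tool call can silently bypass the frontend.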
Why AG-UI Matters for Production Agent Applications
If you're building agent-powered products, AG-UI provides:
Component reusability. Frontend components built to the AG-UI spec work with any compliant backend. Build a streaming chat UI once; use it with LangGraph, CrewAI, and AutoGen without changes.
Consistent user experience. Users see the same interaction patterns across different agent workflows because the event types are standardized.
Debugging. AG-UI's state snapshots and event stream give you a complete record of agent execution. Replaying an event stream shows exactly what the agent saw and did at each step.
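Replay can be as simple as folding the logged events back into conversation state. The sketch below reconstructs the message list from an event log, using illustrative dict events (`messageId`, `messages` field names are assumptions):

```python
def replay(events):
    """Replay an AG-UI event log to reconstruct the final message list (sketch)."""
    messages, open_msgs = [], {}
    for ev in events:
        t = ev["type"]
        if t == "MESSAGES_SNAPSHOT":
            messages = list(ev["messages"])  # snapshot is an authoritative reset point
            open_msgs.clear()
        elif t == "TEXT_MESSAGE_START":
            open_msgs[ev["messageId"]] = {"role": ev["role"], "content": ""}
        elif t == "TEXT_MESSAGE_CONTENT":
            open_msgs[ev["messageId"]]["content"] += ev["delta"]
        elif t == "TEXT_MESSAGE_END":
            messages.append(open_msgs.pop(ev["messageId"]))  # message is complete
    return messages

log = [
    {"type": "MESSAGES_SNAPSHOT", "messages": [{"role": "user", "content": "hi"}]},
    {"type": "TEXT_MESSAGE_START", "messageId": "m1", "role": "assistant"},
    {"type": "TEXT_MESSAGE_CONTENT", "messageId": "m1", "delta": "hel"},
    {"type": "TEXT_MESSAGE_CONTENT", "messageId": "m1", "delta": "lo"},
    {"type": "TEXT_MESSAGE_END", "messageId": "m1"},
]
print(replay(log)[-1]["content"])
# hello
```

The same fold extended with `STATE_SNAPSHOT` / `STATE_DELTA` handling recovers the agent's full state at any point in the run.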
Human oversight. The `AgentInput` mechanism for mid-task human intervention is built into the protocol—not bolted on as an afterthought.
Conclusion
AG-UI fills a real gap in the agentic AI infrastructure stack. As agents become more capable and more user-facing, the protocol for how they communicate their state and receive human input becomes as important as the tools they can access.
For developers building agent-powered products in 2026, adopting AG-UI early means building on a foundation that the ecosystem is converging toward—rather than maintaining a bespoke communication layer that becomes a liability as your product grows.
Further reading: