Hooks
Middleware provides two styles of hooks to intercept agent execution:
- Node-style hooks: Run sequentially at specific execution points.
- Wrap-style hooks: Run around each model or tool call.
Node-style hooks
Run sequentially at specific execution points. Use for logging, validation, and state updates. Choose the hooks your middleware needs. You can choose between node-style hooks and wrap-style hooks.

Node-style hooks run at specific execution points:

| Hook | When it runs |
|---|---|
| `before_agent` | Before agent starts (once per invocation) |
| `before_model` | Before each model call |
| `after_model` | After each model response |
| `after_agent` | After agent completes (once per invocation) |

Wrap-style hooks run around each call:

| Hook | When it runs |
|---|---|
| `wrap_model_call` | Around each model call |
| `wrap_tool_call` | Around each tool call |
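The node-style timing in the tables above can be pictured with a small plain-Python model. This is only a sketch: `TraceMiddleware` and `run_agent` are hypothetical names, the loop is a stand-in for the real agent, and nothing here uses the library's API.

```python
# Plain-Python model of node-style hook timing; not the library's agent loop.
class TraceMiddleware:
    """Hypothetical middleware that records when each node-style hook fires."""

    def __init__(self):
        self.events = []

    def before_agent(self, state):
        self.events.append("before_agent")

    def before_model(self, state):
        self.events.append("before_model")

    def after_model(self, state):
        self.events.append("after_model")

    def after_agent(self, state):
        self.events.append("after_agent")


def run_agent(middleware, model_calls):
    """Toy loop: agent-level hooks fire once, model-level hooks once per call."""
    state = {"messages": []}
    middleware.before_agent(state)
    for i in range(model_calls):
        middleware.before_model(state)
        state["messages"].append(f"model response {i}")  # stand-in for the model
        middleware.after_model(state)
    middleware.after_agent(state)
    return state


tracer = TraceMiddleware()
run_agent(tracer, model_calls=2)
print(tracer.events)
```

With two model calls, the agent-level hooks appear once each and the model-level hooks appear twice.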
Wrap-style hooks
Intercept execution and control when the handler is called. Use for retries, caching, and transformation. You decide if the handler is called zero times (short-circuit), once (normal flow), or multiple times (retry logic). Available hooks:
- `wrap_model_call`: Around each model call
- `wrap_tool_call`: Around each tool call
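The three handler-call patterns (zero, one, or several calls) can be sketched in plain Python. The `(request, handler)` shape below is an assumption made for illustration, and `cache_wrapper`, `retry_wrapper`, and `flaky_model` are hypothetical names rather than library functions.

```python
# Plain-Python model of wrap-style control flow; not the library's API.
def cache_wrapper(cache):
    """Short-circuit: return a cached response without calling the handler."""
    def wrap(request, handler):
        if request in cache:
            return cache[request]          # handler called zero times
        response = handler(request)        # handler called once
        cache[request] = response
        return response
    return wrap


def retry_wrapper(max_attempts):
    """Retry: call the handler again after a failure (max_attempts >= 1)."""
    def wrap(request, handler):
        last_error = None
        for _ in range(max_attempts):
            try:
                return handler(request)    # may run several times
            except RuntimeError as exc:
                last_error = exc
        raise last_error
    return wrap


calls = {"count": 0}


def flaky_model(request):
    """Stand-in model that fails twice before succeeding."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("transient failure")
    return f"answer to {request}"


result = retry_wrapper(max_attempts=3)("hello", flaky_model)
cached = cache_wrapper({"hello": "cached answer"})("hello", flaky_model)
```

The retry wrapper reaches the stand-in model three times, while the cache wrapper answers without reaching it at all.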
State updates
Both node-style and wrap-style hooks can update agent state. The mechanism differs:
- Node-style hooks (`before_agent`, `before_model`, `after_model`, `after_agent`): Return a dict directly. The dict is applied to the agent state using the graph’s reducers.
- Wrap-style hooks (`wrap_model_call`, `wrap_tool_call`): For model calls, return `ExtendedModelResponse` with a `Command` to inject state updates alongside the model response. For tool calls, return a `Command` directly. Use these when you need to track or update state based on logic that runs during the model or tool call, such as summarization trigger points, usage metadata, or custom fields calculated from the request or response.
Node-style hooks
Return a dict from a node-style hook to merge updates into agent state. The dict keys map to state fields.
Wrap-style hooks
Return an `ExtendedModelResponse` with a `Command` from `wrap_model_call` to inject state updates from the model call layer.
The `Command` flows through the graph’s reducers, so updates are applied correctly and messages are additive rather than replacing existing state.
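To make the reducer behavior concrete, here is a minimal plain-Python model of how an update might be merged into state. The `REDUCERS` table and `apply_update` function are hypothetical; the sketch assumes `messages` is the only field with an additive reducer and that every other field is overwritten.

```python
import operator

# Hypothetical reducer table: listed fields are combined, all others overwritten.
REDUCERS = {"messages": operator.add}


def apply_update(state, update):
    """Return a new state with `update` merged in through the reducers."""
    new_state = dict(state)
    for key, value in update.items():
        reducer = REDUCERS.get(key)
        if reducer is not None and key in new_state:
            new_state[key] = reducer(new_state[key], value)  # additive field
        else:
            new_state[key] = value                           # plain overwrite
    return new_state


state = {"messages": ["hi"], "model_calls": 1}
update = {"messages": ["hello!"], "model_calls": 2}
state = apply_update(state, update)
print(state)
```

The new message is appended to the existing list, while `model_calls` simply takes the new value.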
Composition with multiple middleware
When multiple middleware layers return `ExtendedModelResponse`, their commands compose:
- Commands are applied through reducers: Each `Command` becomes a separate state update. For messages, this means they are additive.
- Outer wins on conflicts: For non-reducer state fields, commands are applied inner-first, then outer. The outermost middleware’s value takes precedence on conflicting keys.
- Retry-safe: If the outer middleware calls `handler()` more than once (for example, in retry logic), commands from earlier calls are discarded.
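The composition rules above can be modeled with two nested wrappers in plain Python. In this sketch each layer returns a `(response, commands)` tuple, which is an assumption standing in for `ExtendedModelResponse`; `inner`, `outer`, and `apply_commands` are hypothetical names.

```python
# Plain-Python model of command composition; not the library's implementation.
def apply_commands(state, commands):
    """Apply each command in order; messages are additive, other keys overwrite."""
    new_state = dict(state)
    for update in commands:
        for key, value in update.items():
            if key == "messages":
                new_state[key] = new_state.get(key, []) + value
            else:
                new_state[key] = value
    return new_state


def model(request):
    return "response", []  # (response, commands)


def inner(request, handler):
    response, commands = handler(request)
    return response, commands + [{"source": "inner", "messages": ["inner note"]}]


def outer(request, handler):
    handler(request)                       # first attempt: its commands are dropped
    response, commands = handler(request)  # retry: only these commands are kept
    return response, commands + [{"source": "outer", "messages": ["outer note"]}]


response, commands = outer("req", lambda r: inner(r, model))
final = apply_commands({"messages": []}, commands)
print(final)
```

The inner command is applied first and the outer one second, so the outer value wins for `source`, both notes are kept in `messages`, and the discarded first attempt leaves no trace.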
Create middleware
You can create middleware in two ways:
- Decorator-based middleware: Quick and simple for single-hook middleware. Use decorators to wrap individual functions.
- Class-based middleware: More powerful for complex middleware with multiple hooks or configuration.
Decorator-based middleware
Quick and simple for single-hook middleware. Use decorators to wrap individual functions. Available decorators:
Node-style:
- `@before_agent`: Runs before agent starts (once per invocation)
- `@before_model`: Runs before each model call
- `@after_model`: Runs after each model response
- `@after_agent`: Runs after agent completes (once per invocation)
Wrap-style:
- `@wrap_model_call`: Wraps each model call with custom logic
- `@wrap_tool_call`: Wraps each tool call with custom logic
Other:
- `@dynamic_prompt`: Generates dynamic system prompts
Use decorators when:
- Single hook needed
- No complex configuration
- Quick prototyping
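As a rough picture of what "wrapping an individual function" means, the sketch below turns one function into a single-hook middleware object. `single_hook` and `FunctionMiddleware` are hypothetical stand-ins written for this illustration; they are not the library's decorators and say nothing about how those are implemented.

```python
# Plain-Python model of a decorator that builds single-hook middleware.
class FunctionMiddleware:
    """Hypothetical middleware object exposing exactly one hook."""

    def __init__(self, hook_name, func):
        self.name = func.__name__
        setattr(self, hook_name, func)  # instance attribute, so it is not bound


def single_hook(hook_name):
    """Hypothetical decorator factory: attach `func` under `hook_name`."""
    def decorate(func):
        return FunctionMiddleware(hook_name, func)
    return decorate


@single_hook("before_model")
def count_messages(state):
    # A node-style hook returns a dict of state updates.
    return {"message_count": len(state["messages"])}


update = count_messages.before_model({"messages": ["a", "b"]})
print(count_messages.name, update)
```

The decorated name now refers to a middleware object whose only hook is the original function.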
Class-based middleware
More powerful for complex middleware with multiple hooks or configuration. Use classes when you need to define both sync and async implementations for the same hook, or when you want to combine multiple hooks in a single middleware. Use classes for:
- Defining both sync and async implementations for the same hook
- Multiple hooks needed in a single middleware
- Complex configuration required (e.g., configurable thresholds, custom models)
- Reuse across projects with init-time configuration
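A class makes the points in this list easy to see together: init-time configuration, several hooks, and a sync and async variant of the same hook. The sketch below is a plain-Python class, not a subclass of the library's base class, and the `abefore_model` name for the async variant is an assumption of this example.

```python
import asyncio


class MessageLimitMiddleware:
    """Hypothetical class-based middleware with a configurable threshold."""

    def __init__(self, max_messages=50):
        self.max_messages = max_messages  # init-time configuration

    def _check(self, state):
        # Shared logic used by both the sync and async hook.
        if len(state["messages"]) >= self.max_messages:
            return {"limit_reached": True}
        return None

    def before_model(self, state):
        return self._check(state)

    async def abefore_model(self, state):
        return self._check(state)

    def after_model(self, state):
        # A second hook living in the same middleware.
        return {"last_message": state["messages"][-1]}


strict = MessageLimitMiddleware(max_messages=2)
sync_result = strict.before_model({"messages": ["a", "b"]})
async_result = asyncio.run(strict.abefore_model({"messages": ["a"]}))
last = strict.after_model({"messages": ["a", "b"]})
```

Keeping the shared logic in one private method avoids the sync and async hooks drifting apart.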
Custom state schema
If your middleware needs to track state across hooks, middleware can extend the agent’s state with custom properties. This enables middleware to:
- Track state across execution: Maintain counters, flags, or other values that persist throughout the agent’s execution lifecycle
- Share data between hooks: Pass information from `before_model` to `after_model` or between different middleware instances
- Implement cross-cutting concerns: Add functionality like rate limiting, usage tracking, user context, or audit logging without modifying the core agent logic
- Make conditional decisions: Use accumulated state to determine whether to continue execution, jump to different nodes, or modify behavior dynamically
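The idea of extending state with custom properties can be sketched with `TypedDict` inheritance. `BaseState` and `AuditState` are hypothetical schemas for this illustration, and the loop at the bottom is a toy stand-in for the agent that overwrites fields instead of using reducers.

```python
from typing import TypedDict


class BaseState(TypedDict):
    """Hypothetical base agent state."""
    messages: list


class AuditState(BaseState, total=False):
    """Base state extended with custom, optional middleware fields."""
    model_call_count: int
    audit_log: list


def before_model(state: AuditState) -> dict:
    # Track state across execution: count every model call.
    return {"model_call_count": state.get("model_call_count", 0) + 1}


def after_model(state: AuditState) -> dict:
    # Share data between hooks: read the counter written by before_model.
    entry = f"call {state['model_call_count']} finished"
    return {"audit_log": state.get("audit_log", []) + [entry]}


state: AuditState = {"messages": []}
for _ in range(2):
    state.update(before_model(state))
    state["messages"].append("model response")  # stand-in for the model call
    state.update(after_model(state))
print(state["audit_log"])
```

After two iterations the counter and the audit log both reflect values written by one hook and read by the other.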
Execution order
When using multiple middleware, understand how they execute. The flow below assumes three middleware registered in the order middleware1, middleware2, middleware3.
Execution flow
Before hooks run in order:
- `middleware1.before_agent()`
- `middleware2.before_agent()`
- `middleware3.before_agent()`
- `middleware1.before_model()`
- `middleware2.before_model()`
- `middleware3.before_model()`
Wrap hooks nest, with the first middleware outermost:
- `middleware1.wrap_model_call()` → `middleware2.wrap_model_call()` → `middleware3.wrap_model_call()` → model
After hooks run in reverse order:
- `middleware3.after_model()`
- `middleware2.after_model()`
- `middleware1.after_model()`
- `middleware3.after_agent()`
- `middleware2.after_agent()`
- `middleware1.after_agent()`
Key rules:
- `before_*` hooks: First to last
- `after_*` hooks: Last to first (reverse)
- `wrap_*` hooks: Nested (first middleware wraps all others)
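The ordering rules can be checked with a small plain-Python trace. `TracingMiddleware`, `nest`, and `run_model_step` are hypothetical helpers for this sketch; they model one model step with two middleware and do not call the library.

```python
# Plain-Python model of hook ordering for a single model step.
class TracingMiddleware:
    def __init__(self, name, trace):
        self.name = name
        self.trace = trace

    def before_model(self):
        self.trace.append(f"{self.name}.before_model")

    def after_model(self):
        self.trace.append(f"{self.name}.after_model")

    def wrap_model_call(self, request, handler):
        self.trace.append(f"{self.name}.wrap_model_call:enter")
        response = handler(request)
        self.trace.append(f"{self.name}.wrap_model_call:exit")
        return response


def nest(middleware, next_handler):
    """Build a handler that routes the call through one wrap hook."""
    return lambda request: middleware.wrap_model_call(request, next_handler)


def run_model_step(middlewares, trace):
    for mw in middlewares:                # before_*: first to last
        mw.before_model()

    def model(request):
        trace.append("model")
        return "response"

    handler = model
    for mw in reversed(middlewares):      # wrap_*: first middleware ends up outermost
        handler = nest(mw, handler)
    handler("request")

    for mw in reversed(middlewares):      # after_*: last to first
        mw.after_model()


trace = []
stack = [TracingMiddleware("middleware1", trace), TracingMiddleware("middleware2", trace)]
run_model_step(stack, trace)
print("\n".join(trace))
```

Building the wrap chain from the last middleware inward is what leaves the first middleware as the outermost wrapper.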
Agent jumps
To exit early from middleware, return a dictionary with `jump_to`.
Available jump targets:
- `'end'`: Jump to the end of the agent execution (or the first `after_agent` hook)
- `'tools'`: Jump to the tools node
- `'model'`: Jump to the model node (or the first `before_model` hook)
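A minimal plain-Python model of an early exit is shown below. The `run` loop is a hypothetical stand-in for the agent that only understands the `'end'` target, and any registration the real library may require before a hook is allowed to jump is not shown.

```python
# Plain-Python model of an early exit via jump_to; not the library's agent loop.
def before_model(state):
    """Hypothetical hook: stop once the conversation has three messages."""
    if len(state["messages"]) >= 3:
        return {"jump_to": "end"}
    return None


def run(state, max_steps=10):
    """Toy loop that honors a jump to 'end' returned by the hook."""
    model_calls = 0
    for _ in range(max_steps):
        update = before_model(state) or {}
        if update.get("jump_to") == "end":
            break                                   # skip the model entirely
        state["messages"].append("model response")  # stand-in for the model
        model_calls += 1
    return model_calls


state = {"messages": ["user question"]}
model_calls = run(state)
print(model_calls, len(state["messages"]))
```

The hook lets two model calls through and then ends the run before a third call is made.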
Best practices
- Keep middleware focused - each should do one thing well
- Handle errors gracefully - don’t let middleware errors crash the agent
- Use appropriate hook types:
  - Node-style for sequential logic (logging, validation)
  - Wrap-style for control flow (retry, fallback, caching)
- Clearly document any custom state properties
- Unit test middleware independently before integrating
- Consider execution order - place critical middleware first in the list
- Use built-in middleware when possible
Examples
Dynamic prompt
Dynamically modify the system prompt at runtime to inject context, user-specific instructions, or other information before each model call. This is one of the most common middleware use cases. Use the `system_message` field on `ModelRequest` to read and modify the system prompt. It contains a `SystemMessage` object (even if the agent was created with a string `system_prompt`).
- `ModelRequest.system_message` is always a `SystemMessage` object, even if the agent was created with `system_prompt="string"`
- Use `SystemMessage.content_blocks` to access content as a list of blocks, regardless of whether the original content was a string or list
- When modifying system messages, use `content_blocks` and append new blocks to preserve existing structure
- You can pass `SystemMessage` objects directly to `create_agent`’s `system_prompt` parameter for advanced use cases like cache control
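The "append new blocks" advice can be illustrated with plain dicts standing in for content blocks. `to_blocks` and `add_user_context` are hypothetical helpers, and the `{"type": "text", "text": ...}` block shape is an assumption of this sketch rather than a statement about `SystemMessage` internals.

```python
# Plain-Python model of appending to a system prompt's content blocks.
def to_blocks(content):
    """Normalize string or list content into a new list of text blocks."""
    if isinstance(content, str):
        return [{"type": "text", "text": content}]
    return list(content)  # shallow copy so the caller's list is not mutated


def add_user_context(system_content, user_name):
    """Append a context block instead of rewriting the existing prompt."""
    blocks = to_blocks(system_content)
    blocks.append({"type": "text", "text": f"The current user is {user_name}."})
    return blocks


from_string = add_user_context("You are a helpful assistant.", "Ada")

original = [{"type": "text", "text": "You are a helpful assistant."}]
from_list = add_user_context(original, "Ada")
print(from_list)
```

Both input forms produce the same two-block result, and the original list is left untouched.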
Dynamic model selection
Dynamically selecting tools
Select relevant tools at runtime to improve performance and accuracy. This section covers filtering pre-registered tools. For registering tools that are discovered at runtime (e.g., from MCP servers), see Runtime tool registration. Benefits:
- Shorter prompts - Reduce complexity by exposing only relevant tools
- Better accuracy - Models choose correctly from fewer options
- Permission control - Dynamically filter tools based on user access
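Permission-based filtering of pre-registered tools might look like the following plain-Python sketch. The tool records, role names, and `select_tools` function are all hypothetical; in a real middleware the filtered list would replace the tools on the model request.

```python
# Plain-Python model of filtering pre-registered tools by user role.
TOOLS = [
    {"name": "search_docs", "required_role": "reader"},
    {"name": "update_record", "required_role": "editor"},
    {"name": "delete_record", "required_role": "admin"},
]


def select_tools(tools, user_roles):
    """Keep only the tools whose required role the user holds."""
    return [tool for tool in tools if tool["required_role"] in user_roles]


visible = [tool["name"] for tool in select_tools(TOOLS, {"reader", "editor"})]
print(visible)
```

A user without the admin role never sees `delete_record`, which also shortens the tool list the model has to choose from.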
Tool call monitoring
Prompt caching (Anthropic)
When working with Anthropic models, use structured content blocks with cache control directives to cache large system prompts.
- `ModelRequest.system_message` is always a `SystemMessage` object, even if the agent was created with `system_prompt="string"`
- Use `SystemMessage.content_blocks` to access content as a list of blocks, regardless of whether the original content was a string or list
- When modifying system messages, use `content_blocks` and append new blocks to preserve existing structure
- You can pass `SystemMessage` objects directly to `create_agent`’s `system_prompt` parameter for advanced use cases like cache control
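As a sketch of what structured blocks with a cache control directive can look like, the helper below builds a two-block system prompt as plain dicts. `cached_system_blocks` is a hypothetical name; the `cache_control` entry follows the shape of Anthropic's prompt caching marker, and how the blocks are passed to the agent is not shown.

```python
# Plain-dict sketch of system prompt blocks with a cache control directive.
def cached_system_blocks(instructions, reference_text):
    """Return short instructions plus a large block marked for caching."""
    return [
        {"type": "text", "text": instructions},
        {
            "type": "text",
            "text": reference_text,
            # Marks the prompt prefix up to and including this block as cacheable.
            "cache_control": {"type": "ephemeral"},
        },
    ]


blocks = cached_system_blocks(
    "You are a support assistant.",
    "<large product manual goes here>",
)
print(blocks[1]["cache_control"])
```

Placing the marker on the large, stable block keeps the short, frequently edited instructions separate from the content meant to be cached.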

