Hooks
Middleware provides two styles of hooks to intercept agent execution:
- Node-style hooks: Run sequentially at specific execution points.
- Wrap-style hooks: Run around each model or tool call.
Node-style hooks
Run sequentially at specific execution points. Use for logging, validation, and state updates.
Choose the hooks your middleware needs. You can choose between node-style hooks and wrap-style hooks.
Node-style hooks run at specific execution points:

| Hook | When it runs |
|---|---|
| beforeAgent | Before agent starts (once per invocation) |
| beforeModel | Before each model call |
| afterModel | After each model response |
| afterAgent | After agent completes (once per invocation) |

Wrap-style hooks run around each model or tool call:

| Hook | When it runs |
|---|---|
| wrapModelCall | Around each model call |
| wrapToolCall | Around each tool call |

Wrap-style hooks
Intercept execution and control when the handler is called. Use for retries, caching, and transformation. You decide if the handler is called zero times (short-circuit), once (normal flow), or multiple times (retry logic).
Available hooks:
- wrapModelCall: Around each model call
- wrapToolCall: Around each tool call
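The three handler-call patterns (zero, one, or several calls) can be illustrated with a minimal, self-contained sketch. The types and helper names below (Handler, WrapHook, withCache, withRetry) are hypothetical stand-ins rather than the library's API, and the handlers are synchronous only to keep the sketch short.

```typescript
// Hypothetical stand-in types; not the library's real request/response types.
type Handler = (request: string) => string;
type WrapHook = (request: string, handler: Handler) => string;

// Zero or one call: return a cached response without calling the handler.
function withCache(cache: Map<string, string>): WrapHook {
  return (request, handler) => {
    const hit = cache.get(request);
    if (hit !== undefined) {
      return hit; // short-circuit: the handler is never called
    }
    const response = handler(request); // normal flow: exactly one call
    cache.set(request, response);
    return response;
  };
}

// One or more calls: retry the handler until it succeeds or attempts run out.
function withRetry(maxAttempts: number): WrapHook {
  return (request, handler) => {
    let lastError: unknown = new Error("maxAttempts must be at least 1");
    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        return handler(request);
      } catch (error) {
        lastError = error; // remember the failure and try again
      }
    }
    throw lastError;
  };
}
```

In this sketch, a handler that fails twice and then succeeds is called three times under withRetry(3), while a second identical request through withCache does not reach the handler at all.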
State updates
Both node-style and wrap-style hooks can update agent state. The mechanism differs:
- Node-style hooks (beforeAgent, beforeModel, afterModel, afterAgent): Return a dict directly. The dict is applied to the agent state using the graph’s reducers.
- Wrap-style hooks (wrapModelCall, wrapToolCall): For model calls, return a Command directly to inject state updates alongside the model response. For tool calls, return a Command directly.
Use these when you need to track or update state based on logic that runs during the model or tool call, such as summarization trigger points, usage metadata, or custom fields calculated from the request or response.
Node-style hooks
Return a dict from a node-style hook to merge updates into agent state. The dict keys map to state fields.
Wrap-style hooks
Return a Command directly from wrapModelCall to inject state updates from the model call layer.
Command flows through the graph’s reducers, so updates are applied correctly and messages are additive rather than replacing existing state.
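To make the reducer behavior concrete, here is a minimal sketch of how a partial update might be merged into state when each field has its own reducer. Everything in it (AgentState, reducers, applyUpdate, the afterModel function) is a hypothetical model written for illustration, not the framework's implementation.

```typescript
// Hypothetical stand-ins for agent state and per-field reducers.
interface AgentState {
  messages: string[];
  modelCallCount: number;
}
type StateUpdate = Partial<AgentState>;

// One reducer per field: messages are appended, the counter is overwritten.
const reducers: {
  [K in keyof AgentState]: (current: AgentState[K], update: AgentState[K]) => AgentState[K];
} = {
  messages: (current, update) => [...current, ...update],
  modelCallCount: (_current, update) => update,
};

// Apply a partial update field by field; untouched fields are kept as-is.
function applyUpdate(state: AgentState, update: StateUpdate): AgentState {
  return {
    messages:
      update.messages !== undefined
        ? reducers.messages(state.messages, update.messages)
        : state.messages,
    modelCallCount:
      update.modelCallCount !== undefined
        ? reducers.modelCallCount(state.modelCallCount, update.modelCallCount)
        : state.modelCallCount,
  };
}

// A node-style hook in this sketch returns a partial update keyed by state field.
const afterModel = (state: AgentState): StateUpdate => ({
  modelCallCount: state.modelCallCount + 1,
});
```

Under these assumptions, an update that carries only messages leaves modelCallCount unchanged and appends to the existing message list instead of replacing it.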
Composition with multiple middleware
When multiple middleware layers return responses, the framework passes on the last AIMessage produced:
- AIMessage flows through: Each middleware’s handler() receives the AIMessage from the previous layer. When a middleware returns an AIMessage, that becomes the input to the next middleware’s handler.
- Command without message updates is pass-through: If a middleware returns a Command whose state update does not touch messages, the framework treats it as a no-op for message flow. The next middleware’s handler receives the AIMessage from the middleware before the one that returned the Command.
- Reducer behavior and retry-safety: Commands still apply through reducers (messages additive, outer wins on conflicts). Retry logic discards commands from earlier calls.
Create middleware
Use the createMiddleware function to define custom middleware.
Custom state schema
If your middleware needs to track state across hooks, it can extend the agent’s state with custom properties. This enables middleware to:
- Track state across execution: Maintain counters, flags, or other values that persist throughout the agent’s execution lifecycle
- Share data between hooks: Pass information from beforeModel to afterModel or between different middleware instances
- Implement cross-cutting concerns: Add functionality like rate limiting, usage tracking, user context, or audit logging without modifying the core agent logic
- Make conditional decisions: Use accumulated state to determine whether to continue execution, jump to different nodes, or modify behavior dynamically
State fields whose names start with an underscore (_) are considered private and will not be included in the agent’s result. Only public fields (those without a leading underscore) are returned.
This is useful for storing internal middleware state that shouldn’t be exposed to the caller, such as temporary tracking variables or internal flags.
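The effect of the underscore convention can be sketched with a small filter. The publicFields helper below is hypothetical and only models the described behavior; the framework's actual filtering logic is not shown in this document.

```typescript
// Hypothetical helper: keep only fields whose names do not start with "_".
function publicFields(state: Record<string, unknown>): Record<string, unknown> {
  const result: Record<string, unknown> = {};
  for (const key of Object.keys(state)) {
    if (key.charAt(0) !== "_") {
      result[key] = state[key]; // public field: returned to the caller
    }
    // private field (leading underscore): dropped from the result
  }
  return result;
}

// Example state with one public counter and one private flag.
const finalState = { modelCallCount: 3, _rateLimitHit: true };
const visibleResult = publicFields(finalState);
```

Here visibleResult contains modelCallCount but not _rateLimitHit, which mirrors the rule that only public fields are returned.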
Custom context
Middleware can define a custom context schema to access per-invocation metadata. Unlike state, context is read-only and not persisted between invocations. This makes it ideal for:
- User information: Pass user ID, roles, or preferences that don’t change during execution
- Configuration overrides: Provide per-invocation settings like rate limits or feature flags
- Tenant/workspace context: Include organization-specific data for multi-tenant applications
- Request metadata: Pass request IDs, API keys, or other metadata needed by middleware
Context is available through runtime.context in middleware hooks. When the contextSchema defines required fields (fields without .optional() or .default()), TypeScript enforces that these fields are provided in agent.invoke() calls. This ensures type safety and prevents runtime errors from missing required context.
Execution order
When using multiple middleware, understand how they execute.
Execution flow
Before hooks run in order:
- middleware1.before_agent()
- middleware2.before_agent()
- middleware3.before_agent()
- middleware1.before_model()
- middleware2.before_model()
- middleware3.before_model()
Wrap hooks nest like function calls:
- middleware1.wrap_model_call() → middleware2.wrap_model_call() → middleware3.wrap_model_call() → model
After hooks run in reverse order:
- middleware3.after_model()
- middleware2.after_model()
- middleware1.after_model()
- middleware3.after_agent()
- middleware2.after_agent()
- middleware1.after_agent()
Key rules:
- before_* hooks: First to last
- after_* hooks: Last to first (reverse)
- wrap_* hooks: Nested (first middleware wraps all others)
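The ordering rules above can be checked with a small simulation. The traceExecution function is a hypothetical model that only records the order in which model-level hooks would fire; it does not call any real middleware, and the agent-level hooks are left out for brevity since they follow the same before and after patterns.

```typescript
// Hypothetical simulation: record hook order for a list of middleware names.
function traceExecution(names: string[]): string[] {
  const log: string[] = [];

  // before_* hooks run first to last.
  for (const name of names) {
    log.push(`${name}.before_model`);
  }

  // wrap_* hooks nest: the first middleware becomes the outermost wrapper.
  const model = (): void => {
    log.push("model");
  };
  const wrapped = names.reduceRight<() => void>(
    (inner, name) => () => {
      log.push(`${name}.wrap_model_call:enter`);
      inner(); // call the next layer (or the model itself)
      log.push(`${name}.wrap_model_call:exit`);
    },
    model,
  );
  wrapped();

  // after_* hooks run last to first.
  for (const name of [...names].reverse()) {
    log.push(`${name}.after_model`);
  }
  return log;
}

const trace = traceExecution(["middleware1", "middleware2"]);
```

For two middleware, the trace reads: both before_model entries in list order, then middleware1 enters its wrapper, middleware2 enters, the model runs, middleware2 exits, middleware1 exits, and finally after_model fires for middleware2 and then middleware1.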
Agent jumps
To exit early from middleware, return a dictionary with jump_to.
Available jump targets:
- 'end': Jump to the end of the agent execution (or the first after_agent hook)
- 'tools': Jump to the tools node
- 'model': Jump to the model node (or the first before_model hook)
Best practices
- Keep middleware focused - each should do one thing well
- Handle errors gracefully - don’t let middleware errors crash the agent
- Use appropriate hook types:
  - Node-style for sequential logic (logging, validation)
  - Wrap-style for control flow (retry, fallback, caching)
- Clearly document any custom state properties
- Unit test middleware independently before integrating
- Consider execution order - place critical middleware first in the list
- Use built-in middleware when possible
Examples
Dynamic prompt
Dynamically modify the system prompt at runtime to inject context, user-specific instructions, or other information before each model call. This is one of the most common middleware use cases. Use the systemMessage field in ModelRequest to read and modify the system prompt. It contains a SystemMessage object (even if the agent was created with a string systemPrompt).
Use SystemMessage.concat to preserve cache control metadata or structured content blocks created by other middleware.
Dynamic model selection
Dynamically selecting tools
Select relevant tools at runtime to improve performance and accuracy. This section covers filtering pre-registered tools. For registering tools that are discovered at runtime (e.g., from MCP servers), see Runtime tool registration.
Benefits:
- Shorter prompts - Reduce complexity by exposing only relevant tools
- Better accuracy - Models choose correctly from fewer options
- Permission control - Dynamically filter tools based on user access
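As a rough illustration of permission-based filtering, the sketch below narrows the tool list before passing the request on to the handler, in the shape of a wrap-style hook. SketchTool, SketchRequest, and filterToolsForRole are hypothetical names defined here for the example; the real request type and how a user's role is obtained are not specified in this section.

```typescript
// Hypothetical stand-in types; not the library's real tool or request types.
type Role = "user" | "admin";
interface SketchTool {
  name: string;
  requiredRole: Role;
}
interface SketchRequest {
  tools: SketchTool[];
  userRole: Role;
}

// Wrap-style shape: adjust the request, then call the handler exactly once.
function filterToolsForRole(
  request: SketchRequest,
  handler: (request: SketchRequest) => string,
): string {
  // Admins see every tool; other users only see tools that need no elevated role.
  const allowed = request.tools.filter(
    (tool) => tool.requiredRole === "user" || request.userRole === "admin",
  );
  // Pass a copy with the narrowed tool list instead of mutating the original.
  return handler({ ...request, tools: allowed });
}

// Example handler that just reports which tools the model would see.
const listTools = (request: SketchRequest): string =>
  request.tools.map((tool) => tool.name).join(",");
```

With a registered search tool (role user) and a delete_records tool (role admin), a regular user's request reaches the handler with only search, while an admin's request keeps both.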
Tool call monitoring
Prompt caching (Anthropic)
When working with Anthropic models, use structured content blocks with cache control directives to cache large system prompts.
- ModelRequest.system_message is always a SystemMessage object, even if the agent was created with system_prompt="string"
- Use SystemMessage.content_blocks to access content as a list of blocks, regardless of whether the original content was a string or list
- When modifying system messages, use content_blocks and append new blocks to preserve existing structure
- You can pass SystemMessage objects directly to create_agent’s system_prompt parameter for advanced use cases like cache control
Example: Chaining middleware - Different middleware can use different approaches:
Use SystemMessage.concat to preserve cache control metadata or structured content blocks created by other middleware.

