Managed Deep Agents is an API-first hosted runtime for creating, running, and operating deep agents in LangSmith. This guide walks through the full flow: define the agent, configure tools, create the managed agent, create a thread, stream a run, and inspect the result in LangSmith.
Managed Deep Agents is in private preview. Join the waitlist to request access.
Deep Agents give developers an open-source harness for building agents that can plan, use tools, delegate to subagents, write files, and work over long horizons. Managed Deep Agents packages the operational layer around that harness, so you can focus on the agent’s behavior instead of running custom runtime infrastructure. For route-level details, see the Managed Deep Agents API reference.

When to use Managed Deep Agents

Use Managed Deep Agents when you want to:
  • Create and manage deep agents programmatically.
  • Run long-running agents without standing up a custom agent server.
  • Stream runs and preserve durable thread state.
  • Use a managed file tree (instructions, skills, subagents, tools, and a /memories/ workspace) without running your own storage.
  • Inspect traces and agent behavior in LangSmith.
You can also deploy Deep Agents with a standard LangSmith Deployment. Use that path when you need custom application code, custom routes, advanced authentication, full Agent Server APIs, stronger isolation controls, or maximum scalability.

Prerequisites

Before you start, make sure you have:
  • Managed Deep Agents private preview access.
  • A LangSmith API key for a workspace with private preview access.
  • A Deep Agent definition, including instructions and any tool configuration you want the managed runtime to use.
Set request defaults for the examples in this guide:
export LANGSMITH_API_KEY="<LANGSMITH_API_KEY>"
export LANGSMITH_API_URL="https://api.smith.langchain.com"
export DEEPAGENTS_BASE_URL="$LANGSMITH_API_URL/v1/deepagents"
Requests require the X-Api-Key header:
X-Api-Key: <LANGSMITH_API_KEY>
Install an HTTP client to call the REST endpoints: httpx in Python, or the built-in fetch in JavaScript. SDK support is coming in a follow-up release.
pip install httpx

Define the agent

Managed Deep Agents keeps the familiar Deep Agents project shape. Keep these files in your source repository so you can review changes and recreate the managed agent when needed:
| File or directory | Purpose |
| --- | --- |
| AGENTS.md | Defines the agent instructions. |
| skills/ | Contains skills the agent can use. |
| subagents/ | Contains subagent definitions for delegated work. |
| tools.json | Configures tools for the managed agent. |
When you submit these files in a create-agent or update-agent request, the platform stores them as a versioned agent repo in the Context Hub. Every update creates a new commit (returned as revision on the agent response). The runtime serves the file tree to the agent on every run, and the agent can also read and write files at run time, including a /memories/ directory for durable cross-run state. For the API request, pass the agent instructions and tool configuration directly in the create-agent payload. Use the same tools.json shape when you configure tools:
{
  "tools": [
    {
      "name": "tavily-search",
      "mcp_server_url": "https://mcp.tavily.com/mcp/",
      "mcp_server_name": "tavily",
      "display_name": "tavily-search"
    }
  ],
  "interrupt_config": {
    "https://mcp.tavily.com/mcp/::tavily-search::tavily": false
  }
}
Each mcp_server_url you reference here must first be registered via Configure MCP servers so that its credentials are available at invocation time. The interrupt_config map lets you require human approval for selected tools. Each key is {mcp_server_url}::{tool_name} (the separator is two colons); trailing slashes on the URL are stripped before matching. Additional ::-separated parts after the tool name (such as the MCP server name, tavily, in the example above) are accepted but ignored when matching. Set the value to true to require human approval before the tool runs, or false to allow it to run without an interrupt.
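The key-matching rules above can be sketched in a few lines of Python. This is an illustration of the documented behavior, not the platform's implementation: parse_interrupt_key and requires_approval are hypothetical helper names, and treating a tool with no entry as "no approval required" is an assumption.

```python
def parse_interrupt_key(key: str) -> tuple[str, str]:
    """Split an interrupt_config key into (normalized_url, tool_name)."""
    parts = key.split("::")
    if len(parts) < 2:
        raise ValueError(f"expected '<mcp_server_url>::<tool_name>', got {key!r}")
    url, tool_name = parts[0], parts[1]  # any further parts are ignored
    return url.rstrip("/"), tool_name  # trailing slashes do not affect matching


def requires_approval(interrupt_config: dict[str, bool], url: str, tool_name: str) -> bool:
    """Look up whether a tool call should pause for human approval."""
    target = (url.rstrip("/"), tool_name)
    for key, value in interrupt_config.items():
        if parse_interrupt_key(key) == target:
            return bool(value)
    return False  # assumption: tools without an entry run without interrupt
```

With the tools.json above, requires_approval(config["interrupt_config"], "https://mcp.tavily.com/mcp", "tavily-search") evaluates to False, with or without the trailing slash on the URL.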

Configure MCP servers

Tools that need credentials (like custom MCP servers, private endpoints, or any bearer-authenticated API) are registered once per workspace via POST /v1/deepagents/mcp-servers. The platform stores the server’s URL and any credential headers; when an agent’s tools[].mcp_server_url matches a registered server, the stored headers are attached automatically each time the tool is invoked.

Register an MCP server

Credentials are sent as a headers array of {key, value} objects. The headers are stored encrypted at rest and re-attached to every invocation against url. The example below registers Tavily’s hosted MCP server with a bearer Authorization header. You will need a Tavily API key; pass it as the bearer token value:
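import os

import httpx

# Request defaults shared by every example in this guide.
BASE_URL = os.environ["DEEPAGENTS_BASE_URL"]
HEADERS = {"X-Api-Key": os.environ["LANGSMITH_API_KEY"]}

# Register the MCP server once per workspace; the platform stores the headers.
response = httpx.post(
    f"{BASE_URL}/mcp-servers",
    headers=HEADERS,
    json={
        "name": "tavily",
        "url": "https://mcp.tavily.com/mcp/",
        "headers": [
            # Replace the elided value with your Tavily API key.
            {"key": "Authorization", "value": "Bearer tvly-..."},
        ],
    },
)
response.raise_for_status()
mcp_server_id = response.json()["id"]
print(f"MCP server ID: {mcp_server_id}")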
Reference the same url from tools[].mcp_server_url in your agent create/update payloads. The MCP server’s credentials are applied automatically; you do not need to pass them again when creating the agent.

List and inspect MCP servers

response = httpx.get(f"{BASE_URL}/mcp-servers", headers=HEADERS)
response.raise_for_status()
servers = response.json()

response = httpx.get(f"{BASE_URL}/mcp-servers/{mcp_server_id}", headers=HEADERS)
response.raise_for_status()
server = response.json()

Rotate credentials

PATCH /v1/deepagents/mcp-servers/{mcp_server_id} replaces the entire headers array. Pass the full new header set; partial diffs are not supported.
response = httpx.patch(
    f"{BASE_URL}/mcp-servers/{mcp_server_id}",
    headers=HEADERS,
    json={
        "headers": [
            {"key": "Authorization", "value": "Bearer tvly-rotated-..."},
        ],
    },
)
response.raise_for_status()
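Because PATCH replaces the whole headers array, a server registered with several headers needs every unchanged header resent along with the rotated one. The helper below is a minimal sketch of building that full set locally; with_rotated_header is a hypothetical name, and the current header list is assumed to come from your own secret store rather than from an API response.

```python
def with_rotated_header(
    headers: list[dict[str, str]], key: str, value: str
) -> list[dict[str, str]]:
    """Return a complete headers array with `key` replaced, or appended if absent."""
    # Header names are compared case-insensitively, as in HTTP.
    rotated = [dict(h) for h in headers if h["key"].lower() != key.lower()]
    rotated.append({"key": key, "value": value})
    return rotated
```

The returned list can be passed as the "headers" value in the PATCH body; the input list is left unmodified.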

Delete an MCP server

response = httpx.delete(f"{BASE_URL}/mcp-servers/{mcp_server_id}", headers=HEADERS)
response.raise_for_status()
The headers field may contain secrets (bearer tokens, API keys). The field is omitted from response bodies for callers without invoke permission on the server. Treat list and get responses as sensitive, and avoid logging them verbatim.
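One way to follow that advice is to mask header values before a server record reaches a log line. The sketch below assumes a get response is a JSON object whose optional headers field is a list of {key, value} objects, mirroring the request shape; redact_server is a hypothetical helper name.

```python
import copy


def redact_server(server: dict) -> dict:
    """Return a copy of an MCP server record with header values masked."""
    redacted = copy.deepcopy(server)  # never mutate the original response
    for header in redacted.get("headers") or []:
        header["value"] = "***"
    return redacted
```

Logging redact_server(server) instead of server keeps header names visible for debugging while dropping the secret values.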
Managed Deep Agents currently supports MCP servers that authenticate via static headers (such as bearer tokens or custom API-key headers). OAuth-backed registration, including a number of LangSmith-hosted MCP servers that authenticate via OAuth 2.1, is planned for a future release.

Create a managed agent

Create the agent with POST /v1/deepagents/agents. The payload defines the managed resource, runtime settings, instructions, and tools. Each entry in tools[].mcp_server_url must match a server you registered in Configure MCP servers. The platform looks up the URL per workspace and attaches the stored headers at invocation time.
import os

import httpx

BASE_URL = os.environ["DEEPAGENTS_BASE_URL"]
HEADERS = {"X-Api-Key": os.environ["LANGSMITH_API_KEY"]}

response = httpx.post(
    f"{BASE_URL}/agents",
    headers=HEADERS,
    json={
        "name": "research-assistant",
        "description": "Research assistant that can search the web and summarize sources.",
        "runtime": {"model": {"model_id": "anthropic:claude-sonnet-4-6"}},
        "instructions": (
            "You are a careful research assistant. Search for sources, "
            "keep notes, and return concise answers with citations."
        ),
        "tools": {
            "tools": [
                {
                    "name": "tavily-search",
                    "mcp_server_url": "https://mcp.tavily.com/mcp/",
                    "mcp_server_name": "tavily",
                    "display_name": "tavily-search",
                },
            ],
            "interrupt_config": {
                "https://mcp.tavily.com/mcp/::tavily-search::tavily": False,
            },
        },
    },
)
response.raise_for_status()
agent_id = response.json()["id"]
print(f"Agent ID: {agent_id}")
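If the instructions and tool configuration live in AGENTS.md and tools.json as described in Define the agent, the payload can be assembled from those files instead of being written inline. This is a sketch under that assumption: build_agent_payload is a hypothetical helper, and it leaves out skills/ and subagents/ because this guide does not show their payload shape.

```python
import json
from pathlib import Path


def build_agent_payload(project_dir: str, name: str, description: str, model_id: str) -> dict:
    """Assemble a create-agent payload from AGENTS.md and tools.json."""
    root = Path(project_dir)
    return {
        "name": name,
        "description": description,
        "runtime": {"model": {"model_id": model_id}},
        # AGENTS.md holds the agent instructions as plain text.
        "instructions": (root / "AGENTS.md").read_text(encoding="utf-8"),
        # tools.json already has the {"tools": [...], "interrupt_config": {...}} shape.
        "tools": json.loads((root / "tools.json").read_text(encoding="utf-8")),
    }
```

The result can be passed as the json argument of the httpx.post call above, which keeps the source repository as the single place where the agent definition is edited.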

Create a thread

Create a thread before running the agent. Threads preserve the conversation and execution state for long-running work.
response = httpx.post(
    f"{BASE_URL}/threads",
    headers=HEADERS,
    json={
        "agent_id": agent_id,
        "options": {
            "test_run": False,
            "skip_memory_write_protection": False,
        },
    },
)
response.raise_for_status()
thread_id = response.json()["id"]
print(f"Thread ID: {thread_id}")

Stream a run

Start work on the thread with POST /v1/deepagents/threads/{thread_id}/runs/stream. Include the agent_id in the request body and set Accept: text/event-stream so your client receives progress as server-sent events.
payload = {
    "agent_id": agent_id,
    "messages": [
        {
            "role": "user",
            "content": "Research recent approaches to agent memory and summarize the main tradeoffs.",
        }
    ],
    "stream_mode": ["values", "updates", "messages-tuple"],
    "stream_subgraphs": True,
    "user_timezone": "America/Los_Angeles",
}

with httpx.stream(
    "POST",
    f"{BASE_URL}/threads/{thread_id}/runs/stream",
    headers={**HEADERS, "Accept": "text/event-stream"},
    json=payload,
    timeout=None,
) as response:
    response.raise_for_status()
    for line in response.iter_lines():
        if line:
            print(line)
Use stream modes based on the experience you want to build:
| Stream mode | Use for |
| --- | --- |
| values | Full state snapshots after steps. |
| updates | Incremental state updates as the agent works. |
| messages-tuple | Token-level message output for chat UIs. |
You can combine modes by passing multiple values in stream_mode. The runtime then interleaves events for each requested mode.
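When several modes are interleaved, it helps to group the raw lines into named events before dispatching on them. The generator below is a minimal sketch that assumes standard server-sent-event framing (event: and data: fields, with a blank line ending each event) and JSON data payloads; iter_sse_events is a hypothetical helper, not part of any SDK.

```python
import json
from typing import Any, Iterable, Iterator


def iter_sse_events(lines: Iterable[str]) -> Iterator[tuple[str, Any]]:
    """Group SSE lines into (event_name, parsed_json_data) pairs."""
    event_name = "message"  # SSE default when no "event:" field is sent
    data_lines: list[str] = []
    for line in lines:
        if line == "":  # a blank line terminates the current event
            if data_lines:
                yield event_name, json.loads("\n".join(data_lines))
            event_name, data_lines = "message", []
        elif line.startswith(":"):  # comment or keep-alive line
            continue
        elif line.startswith("event:"):
            event_name = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_lines.append(line[len("data:"):].removeprefix(" "))
    if data_lines:  # flush a final event that lacks a trailing blank line
        yield event_name, json.loads("\n".join(data_lines))
```

In the streaming example above, this would replace the print loop with something like `for event, data in iter_sse_events(response.iter_lines()): ...`. The blank lines must reach the parser, so the `if line:` filter is dropped.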

Event format

Every stream starts with a metadata event carrying the run ID, then emits one kind of event per requested stream mode. Event names match the mode names, except that messages-tuple mode emits events named messages. Each event carries a JSON data payload.
values mode emits a values event after every step, containing the full graph state:
event: metadata
data: {"run_id":"019e2d68-6b49-7b62-9066-af25f4bb842e","attempt":1}

event: values
data: {"messages":[{"type":"human","content":"Research recent approaches to agent memory","id":"..."}],"files":{},"tools":[]}
updates mode emits an updates event per node, keyed by the node or middleware name that produced the delta. null values indicate the node ran without producing a state update:
event: metadata
data: {"run_id":"019e2d68-e162-7e02-8848-b6d96d19b2ef","attempt":1}

event: updates
data: {"MemoryMiddleware.before_agent":null}

event: updates
data: {"model":{"messages":[{"type":"ai","content":"...","model_name":"claude-sonnet-4-6","usage_metadata":{"input_tokens":15499,"output_tokens":4}}]}}
messages-tuple mode emits a messages event (note that the SSE event name is messages, not messages-tuple). Each payload is a [chunk, metadata] tuple, where the chunk is a partial AIMessageChunk and the metadata describes the node and run context:
event: metadata
data: {"run_id":"019e2d68-ed9d-71d0-8aec-f8cdbd1f9f2e","attempt":1}

event: messages
data: [{"type":"AIMessageChunk","content":[{"text":"ok","type":"text","index":0}],"id":"lc_run--..."},{"langgraph_node":"model","ls_model_name":"claude-sonnet-4-6","ls_provider":"anthropic","run_id":"...","thread_id":"...","assistant_id":"..."}]
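The payload shapes shown above can be consumed with two small helpers, sketched here for already-parsed JSON payloads. chunk_text and updated_nodes are hypothetical names, and handling plain-string content in addition to the list-of-blocks form shown in the sample is an assumption.

```python
from typing import Any


def chunk_text(payload: list[Any]) -> str:
    """Concatenate the text blocks of one `messages` event payload."""
    chunk, _metadata = payload  # each payload is a [chunk, metadata] pair
    content = chunk.get("content", "")
    if isinstance(content, str):  # assumption: content may be a bare string
        return content
    return "".join(
        block.get("text", "")
        for block in content
        if isinstance(block, dict) and block.get("type") == "text"
    )


def updated_nodes(payload: dict[str, Any]) -> list[str]:
    """List the nodes in one `updates` payload that produced a state delta."""
    return [node for node, delta in payload.items() if delta is not None]
```

A chat UI would append chunk_text(data) to the visible message for each messages event, while updated_nodes(data) filters out the null entries that mark nodes which ran without changing state.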
The run is traced in LangSmith. Open the trace to inspect messages, tool calls, files, subagent activity, and runtime behavior.

Update the agent

Use PATCH /v1/deepagents/agents/{agent_id} when you need to update instructions, runtime settings, or tool configuration. The example below extends the agent created above by adding tavily-extract (Tavily’s structured-extraction tool) alongside the existing tavily-search tool. It also requires human approval before each tavily-extract call by setting that tool’s interrupt_config value to true. Because PATCH replaces tools wholesale, pass the full new tool set, not just the additions.
response = httpx.patch(
    f"{BASE_URL}/agents/{agent_id}",
    headers=HEADERS,
    json={
        "description": "Research assistant that searches the web and extracts structured content from pages.",
        "runtime": {"model": {"model_id": "anthropic:claude-sonnet-4-6"}},
        "instructions": (
            "You are a careful research assistant. Search for sources, extract structured information "
            "when useful, keep notes, and return concise answers with citations."
        ),
        "tools": {
            "tools": [
                {
                    "name": "tavily-search",
                    "mcp_server_url": "https://mcp.tavily.com/mcp/",
                    "mcp_server_name": "tavily",
                    "display_name": "tavily-search",
                },
                {
                    "name": "tavily-extract",
                    "mcp_server_url": "https://mcp.tavily.com/mcp/",
                    "mcp_server_name": "tavily",
                    "display_name": "tavily-extract",
                },
            ],
            "interrupt_config": {
                "https://mcp.tavily.com/mcp/::tavily-search::tavily": False,
                "https://mcp.tavily.com/mcp/::tavily-extract::tavily": True,
            },
        },
    },
)
response.raise_for_status()
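Since tools is replaced wholesale, one way to avoid dropping an existing tool is to derive the new configuration from the current one. The sketch below assumes the configuration has the tools.json shape used throughout this guide and builds the interrupt key in the same three-part form as the examples; add_tool is a hypothetical helper.

```python
import copy


def add_tool(
    tools_config: dict,
    name: str,
    mcp_server_url: str,
    mcp_server_name: str,
    require_approval: bool,
) -> dict:
    """Return a full tools configuration with one more tool added."""
    updated = copy.deepcopy(tools_config)  # leave the caller's config untouched
    updated.setdefault("tools", []).append(
        {
            "name": name,
            "mcp_server_url": mcp_server_url,
            "mcp_server_name": mcp_server_name,
            "display_name": name,  # the examples reuse the tool name here
        }
    )
    key = f"{mcp_server_url}::{name}::{mcp_server_name}"
    updated.setdefault("interrupt_config", {})[key] = require_approval
    return updated
```

The returned dictionary is the complete value to send as "tools" in the PATCH body.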

Manage existing agents

List agents:
import json

response = httpx.get(f"{BASE_URL}/agents", headers=HEADERS)
response.raise_for_status()
agents = response.json()
print(json.dumps(agents, indent=4))
Get one agent:
response = httpx.get(f"{BASE_URL}/agents/{agent_id}", headers=HEADERS)
response.raise_for_status()
agent = response.json()
print(json.dumps(agent, indent=4))
Delete an agent:
response = httpx.delete(f"{BASE_URL}/agents/{agent_id}", headers=HEADERS)
response.raise_for_status()
Deleting an agent does not cascade to its threads. Existing threads remain queryable, but starting new runs on them returns 502. Track and delete threads explicitly when you want to clean them up.

Inspect the result in LangSmith

Managed Deep Agents runs are traced in LangSmith, so teams can debug behavior and inspect tool calls. Use traces to review:
  • The user’s input and the agent’s final response.
  • Model calls and tool calls.
  • Subagent activity.
  • Files and runtime state created during the run.

Limits and notes

The following operational notes apply during the private preview. Behavior may change before general availability.

Supported models

Pass model identifiers in the form {provider}:{model_id}. For example, anthropic:claude-sonnet-4-6 or openai:gpt-5.4-mini. The runtime resolves models with init_chat_model, so any provider that init_chat_model supports is usable from Managed Deep Agents. See Supported providers and models for the current list. Values without a colon are interpreted as references to a saved Playground configuration rather than as model identifiers, so always supply the full {provider}:{model_id} form when configuring a model directly.
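A client-side check can catch identifiers that would otherwise be read as Playground configuration references. The function below is a sketch of such a check, not the runtime's own resolution logic; split_model_id is a hypothetical name, and splitting on the first colon only is an assumption.

```python
def split_model_id(value: str) -> tuple[str, str]:
    """Split '{provider}:{model_id}' and reject values without a provider prefix."""
    provider, sep, model = value.partition(":")  # split on the first colon only
    if not sep or not provider or not model:
        raise ValueError(
            f"{value!r} is not in '{{provider}}:{{model_id}}' form; "
            "values without a colon are treated as saved Playground configurations"
        )
    return provider, model
```

Calling split_model_id on a value before placing it in runtime.model.model_id surfaces a missing provider prefix locally instead of at run time.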

Thread retention

Threads have no retention window or per-workspace cap during private preview. Create as many as you need; existing threads remain accessible for the duration of the preview.

Rate limits and quotas

The Managed Deep Agents endpoints do not enforce per-key, per-workspace, or per-agent rate limits during private preview.

Deleting agents

DELETE /agents/{agent_id} does not cascade to threads. Threads created against a deleted agent remain queryable but cannot start new runs, and new attempts return 502. Track and delete threads explicitly when you want to clean them up, as covered in Manage existing agents.

API stability

Routes live under /v1/deepagents, but the surface is in private preview and may change in backwards-incompatible ways before general availability. Breaking changes are communicated to preview customers directly through the contact provided when access was granted.

SDKs

REST is the only supported interface during private preview. Use any HTTP client (httpx or requests in Python, fetch in JavaScript) against https://api.smith.langchain.com/v1/deepagents. SDK wrappers for Python, JavaScript, and React (useStream) are in progress and will follow in a later release.

Support and feedback

Preview access includes direct support. The contact for bug reports and feature requests is included in the email you receive when access is granted.

Built on open-source Deep Agents

Deep Agents remains open source. Managed Deep Agents is the hosted path for teams that want the Deep Agents harness plus LangSmith-managed runtime infrastructure. You can keep the agent definition in your repository, then use the Managed Deep Agents API to create and operate managed agents in LangSmith.

Private preview scope

During private preview, Managed Deep Agents is available only on LangSmith Cloud in the US region. Self-hosted and Hybrid deployments are not supported, and support for the EU and other regions will follow general availability. The API also does not mirror every LangSmith Deployment endpoint during private preview.