LangSmith Deployment is a workflow orchestration runtime purpose-built for agent workloads. It provides the managed infrastructure agents need to run reliably in production at scale, supporting the full lifecycle from local development to deployment. LangSmith Deployment is framework-agnostic: you can deploy agents built with LangGraph or other frameworks.
Get started building in minutes with the Cloud agent deployment quickstart.

Agent deployment workflow

Start here if you’re building or operating agent applications. This section is about deploying your application. If you need to set up LangSmith infrastructure, the Platform setup section covers infrastructure options.

Capabilities

Durable execution

At its core, LangSmith Deployment is a durable execution engine. Your agents run on a managed task queue with automatic checkpointing, so any run can be retried, replayed, or resumed from the exact point of interruption, not from scratch. Because execution is durable, agents can do things that would be fragile or impossible in a stateless runtime:
  • Wait for external input. An agent calls interrupt() and the runtime checkpoints its state, frees resources, and waits for a human to approve a transaction, a reviewer to edit a draft, or another system to return results. When Command(resume=...) arrives hours or days later, execution picks up exactly where it stopped. This is the primitive underneath human-in-the-loop workflows and time-travel debugging.
  • Run in the background. Background runs execute without blocking the caller. The runtime manages the full lifecycle (queuing, execution, checkpointing, completion) while the client moves on.
  • Run on a schedule. Cron jobs trigger agent execution on a recurring cadence: a daily summary agent, a weekly report, a periodic data sync. The runtime starts a new execution on schedule with the same durability guarantees.
  • Handle concurrent input. When a user sends new input while an agent is mid-run (double-texting), the runtime can queue it, cancel the in-progress run, or process both in parallel without data races or corrupted state.
  • Retry on failure. Configurable retry policies control backoff, max attempts, and which exceptions trigger retries on a per-node basis. Runs survive process restarts, infrastructure failures, and code revisions mid-execution.
For details on how containers, processes, and the task queue work together, see Agent Server: Runtime architecture. For scaling and throughput tuning, see Configure Agent Server for scale.

Streaming

Agents need to show their work in real time. The runtime provides resumable streaming. If a client disconnects mid-stream (network switch, tab sleep, mobile backgrounding), it reconnects and picks up where it left off. Multiple streaming modes give you control over granularity, from full state snapshots after each step to token-by-token LLM output as it arrives from the provider.

Studio

Studio connects to any Agent Server (local or deployed) and gives you an interactive environment for developing and debugging agents. Visualize execution graphs, inspect state at any checkpoint, step through runs, modify state mid-execution, and branch to explore alternative paths.

Agent composition

Agents don’t run in isolation. RemoteGraph lets any agent call other deployed agents using the same interface you use locally: a research agent delegates to a search agent on a different deployment, a routing agent dispatches to specialized sub-agents. The agents don’t need to know whether they’re calling something local or remote. Native support for MCP and A2A means your deployed agents can expose and consume tool interfaces and agent-to-agent protocols alongside the broader ecosystem.

Deployment options

Every option runs the same runtime and exposes the same APIs; what changes is who manages the infrastructure. For a comparison, refer to Platform setup.

Reference & operations

Securing and customizing your server

Operations

Reference