Start here if you’re building or operating agent applications. This section covers deploying your application; if you need to set up LangSmith infrastructure first, the Platform setup section covers infrastructure options (cloud, hybrid, self-hosted) and setup guides.
Deploy your app
- Cloud: Push code from a git repository
- Hybrid or Self-hosted with control plane: Build and push Docker images, deploy via the UI
- Standalone servers: Deploy directly without a control plane
Durable execution
At its core, LangSmith Deployment is a durable execution engine. Your agents run on a managed task queue with automatic checkpointing, so any run can be retried, replayed, or resumed from the exact point of interruption, not from scratch. Because execution is durable, agents can do things that would be fragile or impossible in a stateless runtime:
- Wait for external input. An agent calls `interrupt()` and the runtime checkpoints its state, frees resources, and waits: for a human to approve a transaction, for a reviewer to edit a draft, for another system to return results. When `Command(resume=...)` arrives hours or days later, execution picks up exactly where it stopped. This is the primitive underneath human-in-the-loop workflows and time-travel debugging.
- Run in the background. Background runs execute without blocking the caller. The runtime manages the full lifecycle (queuing, execution, checkpointing, completion) while the client moves on.
- Run on a schedule. Cron jobs trigger agent execution on a recurring cadence. A daily summary agent, a weekly report, a periodic data sync — the runtime starts a new execution on schedule with the same durability guarantees.
- Handle concurrent input. When a user sends new input while an agent is mid-run (double-texting), the runtime can queue it, cancel the in-progress run, or process both in parallel — without data races or corrupted state.
- Retry on failure. Configurable retry policies control backoff, max attempts, and which exceptions trigger retries on a per-node basis. Runs survive process restarts, infrastructure failures, and code revisions mid-execution.
Streaming
Agents need to show their work in real time. The runtime provides resumable streaming: if a client disconnects mid-stream (a network switch, tab sleep, mobile backgrounding), it reconnects and picks up where it left off. Multiple streaming modes give you control over granularity, from full state snapshots after each step to token-by-token LLM output as it arrives from the provider.
Studio
LangGraph Studio connects to any Agent Server, local or deployed, and gives you an interactive environment for developing and debugging agents. Visualize execution graphs, inspect state at any checkpoint, step through runs, modify state mid-execution, and branch to explore alternative paths.
Agent composition
Agents don’t run in isolation. RemoteGraph lets any agent call other deployed agents through the same interface you use locally: a research agent delegates to a search agent on a different deployment, a routing agent dispatches to specialized sub-agents. The agents don’t need to know whether they’re calling something local or remote. Native support for MCP and A2A means your deployed agents can expose and consume tool interfaces and agent-to-agent protocols alongside the broader ecosystem.
Deployment options
- Cloud — Fully managed. Push from a git repo.
- Hybrid — Runs in your cloud, managed by the LangSmith control plane.
- Self-hosted — Fully self-managed in your own infrastructure.
Go deeper
Securing and customizing your server
- Custom auth — Authentication and multi-tenant access control
- Server customization — Custom routes, middleware, lifespan hooks, encryption
Operations
- CI/CD pipelines
- TTL configuration for state and thread management
- Semantic search
Reference
- Agent Server — Runtime architecture reference