Agent Server provides a set of capabilities for building and operating production agents. This section covers:

Streaming API

Stream outputs from your deployed agent in real time using the LangGraph SDK.

Human-in-the-loop

Pause agent execution to review, edit, or approve tool calls before continuing.

Time travel

Replay agent runs from any prior state to debug or explore alternative paths.

MCP endpoint

Expose your agents as MCP tools, accessible to any MCP-compliant client.

A2A endpoint

Enable agent-to-agent communication using the A2A protocol.

Distributed tracing

Unify traces across services when calling Agent Server from external applications.

Webhooks

Trigger external systems in response to run events from your deployed agent.

Double-texting

Control how Agent Server handles a new message while a run is already in progress.

Durable execution

At its core, LangSmith Deployment is a durable execution engine. Your agents run on a managed task queue with automatic checkpointing, so any run can be retried, replayed, or resumed from the exact point of interruption rather than from scratch. Because execution is durable, agents can do things that would be fragile or impossible in a stateless runtime:
  • Wait for external input. An agent calls interrupt() and the runtime checkpoints its state, frees resources, and waits for a human to approve a transaction, a reviewer to edit a draft, or another system to return results. When Command(resume=...) arrives hours or days later, execution picks up exactly where it stopped. This is the primitive underneath human-in-the-loop workflows and time-travel debugging.
  • Run in the background. Background runs execute without blocking the caller. The runtime manages the full lifecycle (queuing, execution, checkpointing, completion) while the client moves on.
  • Run on a schedule. Cron jobs trigger agent execution on a recurring cadence: a daily summary agent, a weekly report, a periodic data sync. The runtime starts a new execution on schedule with the same durability guarantees.
  • Handle concurrent input. When a user sends new input while an agent is mid-run (double-texting), the runtime can queue it, cancel the in-progress run, or process both in parallel without data races or corrupted state.
  • Retry on failure. Configurable retry policies control backoff, max attempts, and which exceptions trigger retries on a per-node basis. Runs survive process restarts, infrastructure failures, and code revisions mid-execution.
For details on how containers, processes, and the task queue work together, see Agent Server: Runtime architecture. For scaling and throughput tuning, see Configure Agent Server for scale.