An agent is a model calling tools in a loop until a given task is complete.

Core agent loop diagram
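
The loop can be sketched in a few lines. This is a minimal illustration, not LangChain's implementation: the message types, the stub model, and runAgentLoop are all hypothetical names invented for the example, and the "model" is a plain function that requests one search and then answers.

```typescript
// Hypothetical types for illustration; these are not LangChain's interfaces.
type ToolCall = { name: string; args: { query: string } };
type AssistantMessage = { role: "assistant"; content: string; toolCalls: ToolCall[] };
type Message = { role: "user" | "tool"; content: string } | AssistantMessage;

type Model = (messages: Message[]) => AssistantMessage;
type Tool = (args: { query: string }) => string;

function runAgentLoop(
  model: Model,
  tools: Record<string, Tool | undefined>,
  input: string,
  maxSteps = 10,
): Message[] {
  const messages: Message[] = [{ role: "user", content: input }];
  for (let step = 0; step < maxSteps; step++) {
    const reply = model(messages);
    messages.push(reply);
    // No tool calls means the model considers the task complete.
    if (reply.toolCalls.length === 0) break;
    for (const call of reply.toolCalls) {
      const tool = tools[call.name];
      const content = tool ? tool(call.args) : `Unknown tool: ${call.name}`;
      messages.push({ role: "tool", content });
    }
  }
  return messages;
}

// Stub model: requests one search, then answers from the tool result.
const stubModel: Model = (messages) => {
  const last = messages[messages.length - 1];
  if (last.role === "tool") {
    return { role: "assistant", content: `Answer based on: ${last.content}`, toolCalls: [] };
  }
  return {
    role: "assistant",
    content: "",
    toolCalls: [{ name: "search", args: { query: "AI news" } }],
  };
};

const transcript = runAgentLoop(
  stubModel,
  { search: ({ query }) => `Results for: ${query}` },
  "Find AI news",
);
```

The maxSteps cap is a design choice for the sketch: it keeps a misbehaving model from looping forever.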
Agent = Model + Harness
The job of a harness: get the model the right context at the right time for the given task.
A harness is everything around that loop: the model, its prompt, its tools, and any middleware that shapes its behavior. createAgent is a highly configurable harness. At its simplest, you can create one with:
import { createAgent } from "langchain";

var agent = createAgent({ model: "google-genai:gemini-3.5-flash", tools });
Building on that, you can configure the basics directly with the model, tools, and systemPrompt parameters. For more advanced capabilities, extend the harness with middleware.

Core components

Agent model and harness components diagram

Model

Pass a model identifier string ("provider:model") or an initialized model instance to select the model for your agent. See Models for parameters, provider setup, and dynamic model selection.
import { createAgent } from "langchain";

var agent = createAgent({ model: "google-genai:gemini-3.5-flash", tools });

Tools

To provide the agent with tools, pass LangChain tools (for example, ones created with the tool helper) or tool dicts. See Tools for tool definition, context access, and dynamic tool selection.
import { createAgent, tool } from "langchain";
import * as z from "zod";

var search = tool(({ query }) => `Results for: ${query}`, {
  name: "search",
  description: "Search for information",
  schema: z.object({ query: z.string() }),
});

var agent = createAgent({ model: "google-genai:gemini-3.5-flash", tools: [search] });

System prompt

Shape how the agent approaches tasks. The systemPrompt parameter accepts a string or a SystemMessage. For dynamic prompts at runtime, use middleware.
import { createAgent } from "langchain";

var agent = createAgent({
  model: "google-genai:gemini-3.5-flash",
  tools,
  systemPrompt: "You are a helpful assistant. Be concise and accurate.",
});

Structured output

Return a validated schema from the agent using responseFormat. See Structured output for strategies and examples.
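import { createAgent } from "langchain";
import * as z from "zod";

// Schema the agent's final response is validated against
const Answer = z.object({ summary: z.string(), confidence: z.number() });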

var agent = createAgent({
  model: "google-genai:gemini-3.5-flash",
  tools,
  responseFormat: Answer,
});
const result = await agent.invoke({
  messages: [{ role: "user", content: "Summarize AI trends" }],
});
result.structuredResponse; // { summary: ..., confidence: ... }
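
Conceptually, structured output means the final response is parsed and checked against the schema before it is returned. The sketch below illustrates that check with a hand-written validator so it runs without dependencies. AnswerShape and parseAnswer are hypothetical names, and this is not how responseFormat is implemented.

```typescript
// Hypothetical stand-in for a schema with a string summary and a numeric confidence.
type AnswerShape = { summary: string; confidence: number };

function parseAnswer(rawJson: string): AnswerShape {
  const data: unknown = JSON.parse(rawJson);
  if (typeof data !== "object" || data === null) {
    throw new Error("Expected a JSON object");
  }
  const { summary, confidence } = data as Record<string, unknown>;
  if (typeof summary !== "string") throw new Error("summary must be a string");
  if (typeof confidence !== "number") throw new Error("confidence must be a number");
  return { summary, confidence };
}

// A well-formed response passes validation.
const parsed = parseAnswer('{"summary": "Agents are maturing", "confidence": 0.8}');

// A response missing a required field is rejected instead of being returned as-is.
let rejected = false;
try {
  parseAnswer('{"summary": "Missing confidence"}');
} catch {
  rejected = true;
}
```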

Invocation

Trace each step of this loop, debug tool calls, and evaluate agent outputs with LangSmith. Follow the tracing quickstart to get set up. We also recommend setting up LangSmith Engine, which monitors your traces, detects issues, and proposes fixes.
You can invoke an agent with a message. Behind the scenes, this passes an update to the agent’s State. All agents include a sequence of messages in their state; to invoke the agent, pass a new message along with a thread_id so the agent can persist and resume conversation history:
import { createAgent } from "langchain";
import { MemorySaver } from "@langchain/langgraph";

const agent = createAgent({
  model: "google-genai:gemini-3.5-flash",
  tools: [],
  checkpointer: new MemorySaver(),
});

const config = { configurable: { thread_id: crypto.randomUUID() } };

let result = await agent.invoke(
  {
    messages: [
      { role: "user", content: "What's the weather in San Francisco?" },
    ],
  },
  config,
);

// A follow-up turn on the same conversation: reuse the same thread_id to keep history
result = await agent.invoke(
  { messages: [{ role: "user", content: "What about tomorrow?" }] },
  config,
);
Persisting conversation history with thread_id requires the agent to be configured with a checkpointer. When deployed on LangSmith, a checkpointer is provisioned automatically. Locally, pass one explicitly, for example createAgent({ ..., checkpointer: new MemorySaver() }).
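
To see why thread_id and a checkpointer go together, it can help to picture the checkpointer as a store keyed by thread. The sketch below assumes that simplified model. ThreadStore and invokeWithThread are hypothetical names, and the real MemorySaver stores full checkpoints rather than a bare message list.

```typescript
// Hypothetical in-memory store; a sketch of the idea, not MemorySaver's implementation.
type Message = { role: "user" | "assistant"; content: string };

class ThreadStore {
  private threads: Record<string, Message[] | undefined> = {};

  load(threadId: string): Message[] {
    return this.threads[threadId] ?? [];
  }

  save(threadId: string, messages: Message[]): void {
    this.threads[threadId] = messages;
  }
}

// Stub "agent": replies with how many messages it can see on this thread.
function invokeWithThread(store: ThreadStore, threadId: string, userText: string): Message[] {
  const visible: Message[] = [...store.load(threadId), { role: "user", content: userText }];
  const reply: Message = { role: "assistant", content: `I can see ${visible.length} message(s)` };
  const updated = [...visible, reply];
  store.save(threadId, updated);
  return updated;
}

const store = new ThreadStore();
invokeWithThread(store, "thread-a", "What's the weather in San Francisco?");
// Same thread_id: the earlier turn is loaded before the new message is added.
const secondTurn = invokeWithThread(store, "thread-a", "What about tomorrow?");
// Different thread_id: the conversation starts from an empty history.
const otherThread = invokeWithThread(store, "thread-b", "Hello");
```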
If you also need to pass per-run configuration (such as a user ID, API keys, or feature flags) to tools and middleware, pass it as context alongside the config. Define the shape of that data with contextSchema and access it through runtime.context:
import * as z from "zod";
import { createAgent } from "langchain";
import { MemorySaver } from "@langchain/langgraph";

const contextSchema = z.object({
  user_id: z.string(),
});

const agent = createAgent({
  model: "google-genai:gemini-3.5-flash",
  tools: [],
  contextSchema,
  checkpointer: new MemorySaver(),
});

const result = await agent.invoke(
  {
    messages: [
      { role: "user", content: "What's the weather in San Francisco?" },
    ],
  },
  {
    configurable: { thread_id: crypto.randomUUID() },
    context: { user_id: "user-123" },
  },
);
thread_id scopes the conversation (message history, checkpoints), while context carries per-run data your tools and middleware read at invocation time. Both are commonly passed together. See tool context and Runtime for more.

Streaming

invoke returns the final response at the end of a run. If an agent executes multiple tool calls, users often need progress updates before completion. Use streaming to surface intermediate messages and tool activity as they happen.
const stream = await agent.streamEvents(
  {
    messages: [
      {
        role: "user",
        content: "Search for AI news and summarize the findings",
      },
    ],
  },
  { version: "v3" },
);

for await (const snapshot of stream.values) {
  // Each snapshot contains the full state at that point
  const latestMessage = snapshot.messages.at(-1);
  if (latestMessage?.content) {
    if (latestMessage.type === "human") {
      console.log(`User: ${latestMessage.content}`);
    } else if (latestMessage.type === "ai") {
      console.log(`Agent: ${latestMessage.content}`);
    }
  } else if (latestMessage?.tool_calls?.length) {
    const toolCallNames = latestMessage.tool_calls.map((tc) => tc.name);
    console.log(`Calling tools: ${toolCallNames.join(", ")}`);
  }
}
For streaming modes, event types, and UI patterns, see Streaming.
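
The difference between waiting for the final result and streaming can be shown without any agent at all. The sketch below is a conceptual illustration only: runSteps and its callback are hypothetical and unrelated to the streamEvents API. The same run either returns its final state or reports a snapshot after every step.

```typescript
// Hypothetical step runner; a conceptual sketch, not the streamEvents API.
type Snapshot = { messages: string[] };

function runSteps(
  steps: string[],
  onSnapshot?: (snapshot: Snapshot) => void,
): Snapshot {
  const messages: string[] = [];
  for (const step of steps) {
    messages.push(step);
    // Emit a copy of the state so far, so callers can render progress.
    if (onSnapshot) onSnapshot({ messages: [...messages] });
  }
  return { messages };
}

const steps = [
  "Calling tools: search",
  "Tool result: 3 articles",
  "Agent: Here is a summary",
];

// invoke-style: only the final state is observed.
const finalState = runSteps(steps);

// stream-style: the latest message of every intermediate state is observed.
const seen: string[] = [];
runSteps(steps, (snapshot) => {
  seen.push(snapshot.messages[snapshot.messages.length - 1]);
});
```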

Configure the harness

createAgent is highly extensible. Middleware is the primitive for customization: each piece handles one concern, hooks into the agent loop at the right moment, and composes freely with any other. Take exactly what your use case needs and skip the rest. Common patterns are prebuilt as first-class middleware. You can build anything else as custom middleware.

Agent harness capabilities by category

As agents take on complex work, they need support across a few key areas. The middleware ecosystem provides:

Execution environment

Tools, filesystem, sandboxes, and code execution

Context management

Summarization, memory, skills, and prompt caching

Planning and delegation

Todo lists and subagents for parallel, isolated work

Fault tolerance

Retries, fallbacks, and call limits

Guardrails

PII detection and content controls

Steering

Human-in-the-loop approval before high-impact actions
createDeepAgent pre-assembles this stack for long-running coding and research tasks (filesystem, summarization, subagents, and prompt caching included by default). See Deep Agents for the full prebuilt harness.
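
The idea that each middleware handles one concern and composes with the others can be sketched as function wrapping. This is an assumption-laden illustration: the Middleware type here is a hypothetical "wrap the model call" shape, not the hook interface that LangChain middleware actually exposes.

```typescript
// Hypothetical middleware shape for illustration; see the Middleware docs for the real hooks.
type ModelCall = (prompt: string) => string;
type Middleware = (next: ModelCall) => ModelCall;

const callLog: string[] = [];

// Each middleware handles one concern and delegates the rest to `next`.
const logging: Middleware = (next) => (prompt) => {
  callLog.push(`calling model with: ${prompt}`);
  return next(prompt);
};

const trimming: Middleware = (next) => (prompt) => next(prompt.trim());

// Compose so that the first middleware in the list is the outermost wrapper.
function compose(middleware: Middleware[], base: ModelCall): ModelCall {
  return middleware.reduceRight((next, mw) => mw(next), base);
}

const echoModel: ModelCall = (prompt) => `echo: ${prompt}`;
const wrapped = compose([logging, trimming], echoModel);
const reply = wrapped("  hello  ");
```

Because logging is outermost, it records the prompt before trimming runs. Reordering the list changes what each layer sees.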

Execution environment

Agents are especially useful when they can take action rather than just generate text. The execution environment gives the agent a workspace: tools it can call, a filesystem for reading and writing files across turns, and code execution for running scripts or shell commands.
import { createAgent } from "langchain";
import { createFilesystemMiddleware, StateBackend } from "deepagents";

var agent = createAgent({
  model: "google-genai:gemini-3.5-flash",
  tools: [search],
  middleware: [createFilesystemMiddleware({ backend: new StateBackend() })],
});
See FilesystemMiddleware, Sandboxes, Interpreters.
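
One way to picture a filesystem that persists across turns is a map of paths to contents carried in the agent's state. The sketch below assumes that simplified model. AgentState, writeFile, and readFile are hypothetical names and do not reflect how StateBackend is implemented.

```typescript
// Hypothetical state-backed file store; a sketch of the idea, not deepagents' StateBackend.
type AgentState = { files: Record<string, string> };

function writeFile(state: AgentState, path: string, content: string): AgentState {
  // Return a new state rather than mutating, so each turn has a clear before and after.
  return { files: { ...state.files, [path]: content } };
}

function readFile(state: AgentState, path: string): string {
  return path in state.files ? state.files[path] : `Error: ${path} not found`;
}

// Turn 1 writes notes; turn 2 reads them back from the carried-over state.
let agentState: AgentState = { files: {} };
agentState = writeFile(agentState, "/notes.md", "Found 3 sources");
const recalled = readFile(agentState, "/notes.md");
const missing = readFile(agentState, "/todo.md");
```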

Context management

Every model call has a fixed context window. As an agent runs, that window fills with accumulating history, tool results, and intermediate steps. Summarization compresses history before overflow hits; memory loads persistent instructions at startup so knowledge carries across sessions; skills surface domain knowledge on demand rather than loading everything upfront.
import { createAgent } from "langchain";
import {
  StateBackend,
  createFilesystemMiddleware,
  createSkillsMiddleware,
  createSummarizationMiddleware,
} from "deepagents";

var backend = new StateBackend();
const model = "anthropic:claude-sonnet-4-6";

var agent = createAgent({
  model,
  tools: [search],
  middleware: [
    createFilesystemMiddleware({ backend }),
    createSummarizationMiddleware({ model, backend }),
    createSkillsMiddleware({ backend, sources: ["./skills/"] }),
  ],
});
See SummarizationMiddleware, MemoryMiddleware, Skills, Context engineering.
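
The summarization step described above can be sketched as: estimate the size of the history, and once it exceeds a budget, replace older messages with a summary while keeping the most recent ones. Everything in this sketch is an assumption for illustration: the four-characters-per-token estimate, the compact function, and the placeholder summary text. Real middleware would call a model to write the summary.

```typescript
// Hypothetical summarization step; not the implementation used by the summarization middleware.
type Message = { role: "system" | "user" | "assistant"; content: string };

// Rough token estimate (about 4 characters per token); an assumption for illustration.
const estimateTokens = (messages: Message[]): number =>
  messages.reduce((total, m) => total + Math.ceil(m.content.length / 4), 0);

function compact(messages: Message[], maxTokens: number, keepRecent: number): Message[] {
  if (estimateTokens(messages) <= maxTokens || messages.length <= keepRecent) {
    return messages;
  }
  const older = messages.slice(0, messages.length - keepRecent);
  const recent = messages.slice(messages.length - keepRecent);
  // Placeholder summary: a real implementation would summarize `older` with a model.
  const summary: Message = {
    role: "system",
    content: `Summary of ${older.length} earlier messages`,
  };
  return [summary, ...recent];
}

const conversation: Message[] = [];
for (let i = 0; i < 10; i++) {
  conversation.push({
    role: i % 2 === 0 ? "user" : "assistant",
    content: `Message ${i}: some filler text here`,
  });
}

const compacted = compact(conversation, 50, 2);
const untouched = compact(conversation.slice(0, 2), 50, 2);
```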

Planning and delegation

Complex tasks often exceed what one context window can handle. Delegation lets the main agent break work into pieces, hand them to subagents that each run in their own isolated context, and stay focused on coordination rather than execution. Work can run in parallel; the main agent’s context stays clean.
import { createAgent, todoListMiddleware, tool } from "langchain";
import {
  createFilesystemMiddleware,
  createSubAgentMiddleware,
  StateBackend,
} from "deepagents";
import * as z from "zod";

var search = tool(({ query }) => `Search results for: ${query}`, {
  name: "search",
  description: "Search for a query and return a short summary.",
  schema: z.object({ query: z.string() }),
});

var backend = new StateBackend();

var agent = createAgent({
  model: "google-genai:gemini-3.5-flash",
  tools: [search],
  middleware: [
    createFilesystemMiddleware({ backend }),
    todoListMiddleware(),
    createSubAgentMiddleware({
      defaultModel: "anthropic:claude-sonnet-4-6",
      defaultTools: [],
      subagents: [
        {
          name: "researcher",
          description: "Searches and returns a structured summary.",
          systemPrompt:
            "Use the search tool to research the question and summarize key points.",
          tools: [search],
          model: "anthropic:claude-sonnet-4-6",
          middleware: [],
        },
      ],
    }),
  ],
});
See Subagents.
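
Context isolation is the key property of delegation: the subagent's intermediate steps stay in its own history, and only its final result returns to the main agent. The sketch below illustrates that with plain arrays. Subagent, delegate, and the canned researcher steps are hypothetical and are not the deepagents API.

```typescript
// Hypothetical subagent runner; a sketch of context isolation, not the deepagents API.
type Subagent = { name: string; run: (task: string) => string[] };

const mainContext: string[] = ["user: Compare two vector databases"];

function delegate(subagent: Subagent, task: string): string {
  // The subagent works in its own message list, starting from the task alone.
  const isolated = subagent.run(task);
  // Only the final message crosses back into the main agent's context.
  const result = isolated[isolated.length - 1];
  mainContext.push(`${subagent.name}: ${result}`);
  return result;
}

const researcher: Subagent = {
  name: "researcher",
  run: (task) => [
    `task: ${task}`,
    "tool: search returned 20 raw results",
    "tool: search returned 15 raw results",
    `summary: key points for "${task}"`,
  ],
};

delegate(researcher, "Research database A");
delegate(researcher, "Research database B");
```

After two delegations the main context holds three entries, even though the subagent produced eight messages in total.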

Name your agent

Optionally set an identifier for the agent with the name parameter. This is especially useful when embedding the agent as a subgraph in multi-agent systems.
import { createAgent } from "langchain";

var agent = createAgent({
  model: "google-genai:gemini-3.5-flash",
  tools,
  name: "research_assistant",
});

Fault tolerance

Agents in production encounter failures that rarely appear in development: rate limits, model timeouts, transient API errors. Fault tolerance middleware handles these at the infrastructure level so your tools and business logic don’t need try/catch around every call.
import {
  createAgent,
  modelRetryMiddleware,
  tool,
  toolRetryMiddleware,
} from "langchain";
import * as z from "zod";

var search = tool(({ query }) => `Search results for: ${query}`, {
  name: "search",
  description: "Search for a query and return a short summary.",
  schema: z.object({ query: z.string() }),
});

var agent = createAgent({
  model: "google-genai:gemini-3.5-flash",
  tools: [search],
  middleware: [
    modelRetryMiddleware({ maxRetries: 3 }),
    toolRetryMiddleware({ maxRetries: 2 }),
  ],
});
See modelRetryMiddleware, toolRetryMiddleware, Prebuilt middleware.
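
The behavior a retry limit implies can be sketched with a small wrapper: try the call, and on failure try again until the retry budget is spent. The sketch is synchronous and does not wait between attempts, to keep it short. withRetry is a hypothetical helper, and the assumption that maxRetries counts retries after the initial attempt is mine, not a statement about modelRetryMiddleware.

```typescript
// Hypothetical retry wrapper; synchronous for brevity, with no waiting between attempts.
function withRetry<T>(fn: () => T, maxRetries: number): { value: T; attempts: number } {
  let attempts = 0;
  for (;;) {
    attempts++;
    try {
      return { value: fn(), attempts };
    } catch (error) {
      // Give up once the initial attempt plus maxRetries retries have all failed.
      if (attempts > maxRetries) throw error;
    }
  }
}

// A flaky call that fails twice with a transient error, then succeeds.
let calls = 0;
const flakySearch = (): string => {
  calls++;
  if (calls <= 2) throw new Error("429: rate limited");
  return "Search results";
};

const outcome = withRetry(flakySearch, 3);
```

A production version would normally await asynchronous calls, back off between attempts, and retry only errors known to be transient.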

Guardrails

Some policies can’t live in a prompt—they need to be enforced deterministically regardless of what the model does. Guardrails intercept data as it flows through the agent loop, applying compliance rules or content policies before tool results reach the model’s context.
import { createAgent, piiMiddleware, tool } from "langchain";
import * as z from "zod";

var search = tool(({ query }) => `Search results for: ${query}`, {
  name: "search",
  description: "Search for a query and return a short summary.",
  schema: z.object({ query: z.string() }),
});

var agent = createAgent({
  model: "google-genai:gemini-3.5-flash",
  tools: [search],
  middleware: [piiMiddleware("email")],
});
See piiMiddleware, Prebuilt middleware.
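
A deterministic guardrail can be as simple as a rule applied to every tool result before it enters the model's context. The sketch below redacts email addresses with a regular expression. It is an illustration under stated assumptions: redactEmails and guardToolResult are hypothetical names, the pattern is deliberately simplified, and none of this reflects how piiMiddleware is implemented.

```typescript
// Hypothetical redaction step; a sketch of the idea, not piiMiddleware's implementation.
// Simplified pattern: it will not match every valid email address.
const EMAIL_PATTERN = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;

function redactEmails(text: string): string {
  return text.replace(EMAIL_PATTERN, "[REDACTED_EMAIL]");
}

// Apply the rule to a tool result before it is appended to the model's context.
function guardToolResult(toolResult: string): string {
  return redactEmails(toolResult);
}

const rawResult = "Contact jane.doe@example.com or the team at support@example.org.";
const safeResult = guardToolResult(rawResult);
```

Because the rule runs in code rather than in the prompt, it applies regardless of what the model decides to do.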

Steering

Full autonomy isn’t always appropriate. Steering lets you place humans at specific decision points—before destructive writes, expensive API calls, or anything requiring judgment—without restructuring your agent. The agent pauses and waits; a human approves, edits, or rejects; execution continues.
import { createAgent, humanInTheLoopMiddleware, tool } from "langchain";
import * as z from "zod";

var search = tool(({ query }) => `Search results for: ${query}`, {
  name: "search",
  description: "Search for a query and return a short summary.",
  schema: z.object({ query: z.string() }),
});

var agent = createAgent({
  model: "google-genai:gemini-3.5-flash",
  tools: [search],
  middleware: [humanInTheLoopMiddleware({ interruptOn: { writeFile: true } })],
});
See humanInTheLoopMiddleware, Human-in-the-loop.
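
The approve, edit, or reject flow can be sketched as a gate in front of tool execution. This is a minimal illustration, not the middleware itself: gate, Decision, and the reviewer callback are hypothetical, and a real agent would pause and resume from a checkpoint rather than call a function synchronously.

```typescript
// Hypothetical approval gate; a sketch of the decision flow, not humanInTheLoopMiddleware.
type ToolCall = { name: string; args: Record<string, string> };
type Decision =
  | { type: "approve" }
  | { type: "edit"; args: Record<string, string> }
  | { type: "reject"; reason: string };

function gate(
  call: ToolCall,
  interruptOn: Record<string, boolean>,
  askHuman: (call: ToolCall) => Decision,
): ToolCall | string {
  // Tools not listed in interruptOn run without review.
  if (!interruptOn[call.name]) return call;
  const decision = askHuman(call);
  switch (decision.type) {
    case "approve":
      return call;
    case "edit":
      return { ...call, args: decision.args };
    case "reject":
      return `Rejected: ${decision.reason}`;
  }
}

const policy: Record<string, boolean> = { writeFile: true };

// Stand-in for a human reviewer: rejects writes to one sensitive path.
const reviewer = (call: ToolCall): Decision =>
  call.args["path"] === "/etc/hosts"
    ? { type: "reject", reason: "system file" }
    : { type: "approve" };

const allowed = gate({ name: "search", args: { query: "AI news" } }, policy, reviewer);
const blocked = gate({ name: "writeFile", args: { path: "/etc/hosts" } }, policy, reviewer);
```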

Middleware resources

Middleware overview

How the middleware stack works and when hooks fire

Prebuilt middleware

Full reference with configuration examples

Custom middleware

Write your own hooks for business logic, PII scrubbing, and more