You can stream outputs from a LangGraph agent or workflow.

Supported stream modes

Pass one or more of the following stream modes as an array to the stream() method:
  • values: Streams the full value of the state after each step of the graph.
  • updates: Streams the updates to the state after each step of the graph. If multiple updates are made in the same step (e.g., multiple nodes are run), those updates are streamed separately.
  • custom: Streams custom data from inside your graph nodes.
  • messages: Streams 2-tuples (LLM token, metadata) from any graph nodes where an LLM is invoked.
  • debug: Streams as much information as possible throughout the execution of the graph.

Stream from an agent

Agent progress

To stream agent progress, use the stream() method with streamMode: "updates". This emits an event after every agent step. For example, if you have an agent that calls a tool once, you should see the following updates:
  • LLM node: AI message with tool call requests
  • Tool node: Tool message with execution result
  • LLM node: Final AI response
import { createReactAgent } from "@langchain/langgraph/prebuilt";

const agent = createReactAgent({
  llm: model,
  tools: [getWeather],
});

for await (const chunk of await agent.stream(
  { messages: [{ role: "user", content: "what is the weather in sf" }] },
  { streamMode: "updates" }
)) {
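  // Each chunk is keyed by the node that produced the update, e.g.
  // { agent: { messages: [...] } } or { tools: { messages: [...] } }.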
  console.log(chunk);
  console.log("\n");
}

LLM tokens

To stream tokens as they are produced by the LLM, use streamMode: "messages":
const agent = createReactAgent({
  llm: model,
  tools: [getWeather],
});

for await (const [token, metadata] of await agent.stream(
  { messages: [{ role: "user", content: "what is the weather in sf" }] },
  { streamMode: "messages" }
)) {
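  // `token` is a message chunk produced by the LLM; `metadata` describes
  // the graph node and LLM invocation that emitted it.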
  console.log("Token", token);
  console.log("Metadata", metadata);
  console.log("\n");
}

Tool updates

To stream updates from tools as they execute, use the writer function available on the config object (LangGraphRunnableConfig) passed to the tool.
import { LangGraphRunnableConfig } from "@langchain/langgraph";
import { tool } from "@langchain/core/tools";
import { z } from "zod";

const getWeather = tool(
  async (input, config: LangGraphRunnableConfig) => {
    // Stream any arbitrary data
    config.writer?.("Looking up data for city: " + input.city);
    return `It's always sunny in ${input.city}!`;
  },
  {
    name: "get_weather",
    description: "Get weather for a given city.",
    schema: z.object({
      city: z.string().describe("The city to get weather for."),
    }),
  }
);

const agent = createReactAgent({
  llm: model,
  tools: [getWeather],
});

for await (const chunk of await agent.stream(
  { messages: [{ role: "user", content: "what is the weather in sf" }] },
  { streamMode: "custom" }
)) {
  console.log(chunk);
  console.log("\n");
}
If your tool calls config.writer without optional chaining, you won't be able to invoke the tool outside of a LangGraph execution context without providing a writer function. The example above uses config.writer?.(), so outside LangGraph the write is simply skipped.

Stream multiple modes

You can specify multiple streaming modes by passing streamMode as an array, for example streamMode: ["updates", "messages", "custom"]:
const agent = createReactAgent({
  llm: model,
  tools: [getWeather],
});

for await (const chunk of await agent.stream(
  { messages: [{ role: "user", content: "what is the weather in sf" }] },
  { streamMode: ["updates", "messages", "custom"] }
)) {
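  // With multiple stream modes, each chunk is a [mode, data] tuple.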
  console.log(chunk);
  console.log("\n");
}

Disable streaming

In some applications you might need to disable streaming of individual tokens for a given model. This is useful in multi-agent systems to control which agents stream their output. See the Models guide to learn how to disable streaming.

Stream from a workflow

Basic usage example

LangGraph graphs expose the .stream() method, which yields streamed outputs as an async iterator.
for await (const chunk of await graph.stream(inputs, {
  streamMode: "updates",
})) {
  console.log(chunk);
}

Stream multiple modes

You can pass an array as the streamMode parameter to stream multiple modes at once. The streamed outputs will be tuples of [mode, chunk] where mode is the name of the stream mode and chunk is the data streamed by that mode.
for await (const [mode, chunk] of await graph.stream(inputs, {
  streamMode: ["updates", "custom"],
})) {
  console.log(mode, chunk);
}

Stream graph state

Use the stream modes updates and values to stream the state of the graph as it executes.
  • updates streams the updates to the state after each step of the graph.
  • values streams the full value of the state after each step of the graph.
import { StateGraph, START, END } from "@langchain/langgraph";
import { z } from "zod";

const State = z.object({
  topic: z.string(),
  joke: z.string(),
});

const graph = new StateGraph(State)
  .addNode("refineTopic", (state) => {
    return { topic: state.topic + " and cats" };
  })
  .addNode("generateJoke", (state) => {
    return { joke: `This is a joke about ${state.topic}` };
  })
  .addEdge(START, "refineTopic")
  .addEdge("refineTopic", "generateJoke")
  .addEdge("generateJoke", END)
  .compile();
Use updates to stream only the state updates returned by the nodes after each step. The streamed output includes the name of the node as well as the update.
for await (const chunk of await graph.stream(
  { topic: "ice cream" },
  { streamMode: "updates" }
)) {
  console.log(chunk);
}
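For comparison, streamMode: "values" emits the full state after each step rather than just the update. A minimal sketch using the same graph:
for await (const chunk of await graph.stream(
  { topic: "ice cream" },
  { streamMode: "values" }
)) {
  console.log(chunk);
}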

Stream subgraph outputs

To include outputs from subgraphs in the streamed outputs, you can set subgraphs: true in the .stream() method of the parent graph. This will stream outputs from both the parent graph and any subgraphs. The outputs will be streamed as tuples [namespace, data], where namespace is a tuple with the path to the node where a subgraph is invoked, e.g. ["parent_node:<task_id>", "child_node:<task_id>"].
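For illustration, a parent graph that runs a compiled subgraph as one of its nodes might look like the following. This is a minimal sketch; the state shape and the node names childNode and parentNode are assumptions made for the example:
import { StateGraph, START } from "@langchain/langgraph";
import { z } from "zod";

const SubgraphState = z.object({
  foo: z.string(),
  bar: z.string().default(""),
});

// Child graph
const subgraph = new StateGraph(SubgraphState)
  .addNode("childNode", (state) => {
    return { bar: state.foo + " processed by subgraph" };
  })
  .addEdge(START, "childNode")
  .compile();

// Parent graph: a compiled graph can be added directly as a node
const graph = new StateGraph(SubgraphState)
  .addNode("parentNode", subgraph)
  .addEdge(START, "parentNode")
  .compile();
With a graph like this in place, stream from the parent as follows: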
for await (const chunk of await graph.stream(
  { foo: "foo" },
  {
    subgraphs: true, // (1)!
    streamMode: "updates",
  }
)) {
  console.log(chunk);
}
  1. Set subgraphs: true to stream outputs from subgraphs.

Debugging

Use the debug streaming mode to stream as much information as possible throughout the execution of the graph. The streamed outputs include the name of the node as well as the full state.
for await (const chunk of await graph.stream(
  { topic: "ice cream" },
  { streamMode: "debug" }
)) {
  console.log(chunk);
}

LLM tokens

Use the messages streaming mode to stream Large Language Model (LLM) outputs token by token from any part of your graph, including nodes, tools, subgraphs, or tasks. The streamed output from messages mode is a tuple [message_chunk, metadata] where:
  • message_chunk: the token or message segment from the LLM.
  • metadata: a dictionary containing details about the graph node and LLM invocation.
If your LLM is not available as a LangChain integration, you can stream its outputs using custom mode instead. See use with any LLM for details.
import { ChatOpenAI } from "@langchain/openai";
import { StateGraph, START } from "@langchain/langgraph";
import { z } from "zod";

const MyState = z.object({
  topic: z.string(),
  joke: z.string().default(""),
});

const llm = new ChatOpenAI({ model: "gpt-4o-mini" });

const callModel = async (state: z.infer<typeof MyState>) => {
  // Call the LLM to generate a joke about a topic
  const llmResponse = await llm.invoke([
    { role: "user", content: `Generate a joke about ${state.topic}` },
  ]); // (1)!
  return { joke: llmResponse.content };
};

const graph = new StateGraph(MyState)
  .addNode("callModel", callModel)
  .addEdge(START, "callModel")
  .compile();

for await (const [messageChunk, metadata] of await graph.stream(
  // (2)!
  { topic: "ice cream" },
  { streamMode: "messages" }
)) {
  if (messageChunk.content) {
    console.log(messageChunk.content + "|");
  }
}
  1. Note that the message events are emitted even when the LLM is run using .invoke rather than .stream.
  2. The “messages” stream mode returns an iterator of [messageChunk, metadata] tuples, where messageChunk is the token streamed by the LLM and metadata is a dictionary with details about the graph node where the LLM was called and about the invocation itself.

Filter by LLM invocation

You can associate tags with LLM invocations to filter the streamed tokens by LLM invocation.
import { ChatOpenAI } from "@langchain/openai";

const llm1 = new ChatOpenAI({
  model: "gpt-4o-mini",
  tags: ["joke"] // (1)!
});
const llm2 = new ChatOpenAI({
  model: "gpt-4o-mini",
  tags: ["poem"] // (2)!
});

const graph = // ... define a graph that uses these LLMs (see the sketch below)

for await (const [msg, metadata] of await graph.stream( // (3)!
  { topic: "cats" },
  { streamMode: "messages" }
)) {
  if (metadata.tags?.includes("joke")) { // (4)!
    console.log(msg.content + "|");
  }
}
  1. llm1 is tagged with “joke”.
  2. llm2 is tagged with “poem”.
  3. The streamMode is set to “messages” to stream LLM tokens. The metadata contains information about the LLM invocation, including the tags.
  4. Filter the streamed tokens by the tags field in the metadata to only include the tokens from the LLM invocation with the “joke” tag.
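For reference, the graph elided above could be defined along the following lines. This is a sketch for illustration only; the state shape and the node names writeJoke and writePoem are assumptions:
import { StateGraph, START } from "@langchain/langgraph";
import { z } from "zod";

const State = z.object({
  topic: z.string(),
  joke: z.string().default(""),
  poem: z.string().default(""),
});

const graph = new StateGraph(State)
  .addNode("writeJoke", async (state) => {
    // llm1 is tagged with "joke", so its tokens carry that tag in the metadata
    const response = await llm1.invoke([
      { role: "user", content: `Write a joke about ${state.topic}` },
    ]);
    return { joke: response.content };
  })
  .addNode("writePoem", async (state) => {
    // llm2 is tagged with "poem"
    const response = await llm2.invoke([
      { role: "user", content: `Write a poem about ${state.topic}` },
    ]);
    return { poem: response.content };
  })
  .addEdge(START, "writeJoke")
  .addEdge("writeJoke", "writePoem")
  .compile();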

Filter by node

To stream tokens only from specific nodes, use streamMode: "messages" and filter the outputs by the langgraph_node field in the streamed metadata:
for await (const [msg, metadata] of await graph.stream(
  // (1)!
  inputs,
  { streamMode: "messages" }
)) {
  if (msg.content && metadata.langgraph_node === "some_node_name") {
    // (2)!
    // ...
  }
}
  1. The “messages” stream mode returns [messageChunk, metadata] tuples, where messageChunk is the token streamed by the LLM and metadata is a dictionary with details about the graph node where the LLM was called and about the invocation itself.
  2. Filter the streamed tokens by the langgraph_node field in the metadata to only include the tokens from the specified node (some_node_name in this example).
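For example, with the callModel graph from the LLM tokens section above, you could keep only the tokens emitted inside that node (a minimal sketch reusing that graph):
for await (const [msg, metadata] of await graph.stream(
  { topic: "ice cream" },
  { streamMode: "messages" }
)) {
  if (msg.content && metadata.langgraph_node === "callModel") {
    console.log(msg.content + "|");
  }
}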

Stream custom data

To send custom user-defined data from inside a LangGraph node or tool, follow these steps:
  1. Use the writer parameter from the LangGraphRunnableConfig to emit custom data.
  2. Set streamMode: "custom" when calling .stream() to get the custom data in the stream. You can combine multiple modes (e.g., ["updates", "custom"]), but at least one must be "custom".
import { StateGraph, START, LangGraphRunnableConfig } from "@langchain/langgraph";
import { z } from "zod";

const State = z.object({
  query: z.string(),
  answer: z.string(),
});

const graph = new StateGraph(State)
  .addNode("node", async (state, config) => {
    config.writer?.({ custom_key: "Generating custom data inside node" }); // (1)!
    return { answer: "some data" };
  })
  .addEdge(START, "node")
  .compile();

const inputs = { query: "example" };

// Usage
for await (const chunk of await graph.stream(inputs, { streamMode: "custom" })) { // (2)!
  console.log(chunk);
}
  1. Use the writer to emit a custom key-value pair (e.g., progress update).
  2. Set streamMode: "custom" to receive the custom data in the stream.

Use with any LLM

You can use streamMode: "custom" to stream data from any LLM API — even if that API does not implement the LangChain chat model interface. This lets you integrate raw LLM clients or external services that provide their own streaming interfaces, making LangGraph highly flexible for custom setups.
import { StateGraph, LangGraphRunnableConfig } from "@langchain/langgraph";

const callArbitraryModel = async (
  state: any,
  config: LangGraphRunnableConfig
) => {
  // Example node that calls an arbitrary model and streams the output
  // Assume you have a streaming client that yields chunks
  for await (const chunk of yourCustomStreamingClient(state.topic)) {
    // (1)!
    config.writer?.({ custom_llm_chunk: chunk }); // (2)!
  }
  return { result: "completed" };
};

const graph = new StateGraph(State)
  .addNode("callArbitraryModel", callArbitraryModel)
  // Add other nodes and edges as needed
  .compile();

for await (const chunk of await graph.stream(
  { topic: "cats" },
  { streamMode: "custom" } // (3)!
)) {
  // The chunk will contain the custom data streamed from the llm
  console.log(chunk);
}
  1. Generate LLM tokens using your custom streaming client.
  2. Use the writer to send custom data to the stream.
  3. Set streamMode: "custom" to receive the custom data in the stream.

Disable streaming for specific chat models

If your application mixes models that support streaming with those that do not, you may need to explicitly disable streaming for models that do not support it. Set streaming: false when initializing the model.
import { ChatOpenAI } from "@langchain/openai";

const model = new ChatOpenAI({
  model: "o1-preview",
  streaming: false, // (1)!
});
  1. Set streaming: false to disable token-by-token streaming for this model.