Overview

The router pattern is a multi-agent architecture where a routing step classifies input and directs it to specialized agents, with results synthesized into a combined response. This pattern excels when your organization’s knowledge lives across distinct verticals—separate knowledge domains that each require their own agent with specialized tools and prompts. In this tutorial, you’ll build a multi-source knowledge base router that demonstrates these benefits through a realistic enterprise scenario. The system will coordinate three specialists:
  • A GitHub agent that searches code, issues, and pull requests.
  • A Notion agent that searches internal documentation and wikis.
  • A Slack agent that searches relevant threads and discussions.
When a user asks “How do I authenticate API requests?”, the router decomposes the query into source-specific sub-questions, routes them to the relevant agents in parallel, and synthesizes results into a coherent answer.

Why use a router?

The router pattern provides several advantages:
  • Parallel execution: Query multiple sources simultaneously, reducing latency compared to sequential approaches.
  • Specialized agents: Each vertical has focused tools and prompts optimized for its domain.
  • Selective routing: Not every query needs every source—the router intelligently selects relevant verticals.
  • Targeted sub-questions: Each agent receives a question tailored to its domain, improving result quality.
  • Clean synthesis: Results from multiple sources are combined into a single, coherent response.

Concepts

We will cover the following concepts:
Router vs. Subagents: The subagents pattern can also route to multiple agents. Use the router pattern when you need specialized preprocessing, custom routing logic, or explicit control over parallel execution. Use the subagents pattern when you want the LLM to decide dynamically which agents to call.

Setup

Installation

This tutorial requires the langchain and langgraph packages:
npm install langchain @langchain/langgraph
For more details, see our Installation guide.

LangSmith

Set up LangSmith to inspect what is happening inside your agent. Then set the following environment variables:
export LANGSMITH_TRACING="true"
export LANGSMITH_API_KEY="..."

Select an LLM

Select a chat model from LangChain’s suite of integrations:
👉 Read the OpenAI chat model integration docs
npm install @langchain/openai
import { initChatModel } from "langchain";

process.env.OPENAI_API_KEY = "your-api-key";

const model = await initChatModel("gpt-4.1");

1. Define state

First, define the state schemas. We use three types:
  • AgentInput: Simple state passed to each subagent (just a query)
  • AgentOutput: Result returned by each subagent (source name + result)
  • RouterState: Main workflow state tracking the query, classifications, results, and final answer
import { Annotation } from "@langchain/langgraph";

// Simple input state for each subagent
interface AgentInput {
  query: string;
}

// Output from each subagent
interface AgentOutput {
  source: string;
  result: string;
}

// A single routing decision
interface Classification {
  source: "github" | "notion" | "slack";
  query: string;
}

const RouterState = Annotation.Root({
  query: Annotation<string>(),
  classifications: Annotation<Classification[]>(),
  results: Annotation<AgentOutput[]>({
    reducer: (current, update) => current.concat(update),  // Collect parallel results
    default: () => [],
  }),
  finalAnswer: Annotation<string>(),
});
The results field uses a reducer (operator.add in Python, a concat function in JS) to collect outputs from parallel agent executions into a single list.
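To make the merge concrete, here is a small illustration (plain arrays, outside LangGraph) of what the concat reducer does when two parallel branches each return a one-element update; the values are hypothetical:
// Hypothetical illustration: two parallel branches each return a
// one-element `results` update; the reducer concatenates them.
const fromGithub = [{ source: "github", result: "auth middleware in src/auth.py" }];
const fromNotion = [{ source: "notion", result: "'API Authentication Guide' page" }];
const merged = fromGithub.concat(fromNotion);
// merged: [{ source: "github", ... }, { source: "notion", ... }]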

2. Define tools for each vertical

Create tools for each knowledge domain. In a production system, these would call actual APIs. For this tutorial, we use stub implementations that return mock data. We define 7 tools across 3 verticals: GitHub (search code, issues, PRs), Notion (search docs, get page), and Slack (search messages, get thread).
import { tool } from "langchain";
import { z } from "zod";

const searchCode = tool(
  async ({ query, repo }) => {
    return `Found code matching '${query}' in ${repo || "main"}: authentication middleware in src/auth.py`;
  },
  {
    name: "search_code",
    description: "Search code in GitHub repositories.",
    schema: z.object({
      query: z.string(),
      repo: z.string().optional().default("main"),
    }),
  }
);

const searchIssues = tool(
  async ({ query }) => {
    return `Found 3 issues matching '${query}': #142 (API auth docs), #89 (OAuth flow), #203 (token refresh)`;
  },
  {
    name: "search_issues",
    description: "Search GitHub issues and pull requests.",
    schema: z.object({
      query: z.string(),
    }),
  }
);

const searchPrs = tool(
  async ({ query }) => {
    return `PR #156 added JWT authentication, PR #178 updated OAuth scopes`;
  },
  {
    name: "search_prs",
    description: "Search pull requests for implementation details.",
    schema: z.object({
      query: z.string(),
    }),
  }
);

const searchNotion = tool(
  async ({ query }) => {
    return `Found documentation: 'API Authentication Guide' - covers OAuth2 flow, API keys, and JWT tokens`;
  },
  {
    name: "search_notion",
    description: "Search Notion workspace for documentation.",
    schema: z.object({
      query: z.string(),
    }),
  }
);

const getPage = tool(
  async ({ pageId }) => {
    return `Page content: Step-by-step authentication setup instructions`;
  },
  {
    name: "get_page",
    description: "Get a specific Notion page by ID.",
    schema: z.object({
      pageId: z.string(),
    }),
  }
);

const searchSlack = tool(
  async ({ query }) => {
    return `Found discussion in #engineering: 'Use Bearer tokens for API auth, see docs for refresh flow'`;
  },
  {
    name: "search_slack",
    description: "Search Slack messages and threads.",
    schema: z.object({
      query: z.string(),
    }),
  }
);

const getThread = tool(
  async ({ threadId }) => {
    return `Thread discusses best practices for API key rotation`;
  },
  {
    name: "get_thread",
    description: "Get a specific Slack thread.",
    schema: z.object({
      threadId: z.string(),
    }),
  }
);
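Because the stubs are ordinary LangChain tools, you can smoke-test one directly before wiring it into an agent. A quick sanity check (not part of the workflow itself):
// Invoke a stub tool directly; the omitted `repo` falls back to its default.
const probe = await searchCode.invoke({ query: "authentication" });
console.log(probe);
// -> Found code matching 'authentication' in main: authentication middleware in src/auth.py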

3. Create specialized agents

Create an agent for each vertical. Each agent has domain-specific tools and a prompt optimized for its knowledge source. All three follow the same pattern—only the tools and system prompt differ.
import { createAgent } from "langchain";
import { ChatOpenAI } from "@langchain/openai";

const llm = new ChatOpenAI({ model: "gpt-4o" });

const githubAgent = createAgent({
  model: llm,
  tools: [searchCode, searchIssues, searchPrs],
  systemPrompt: `
You are a GitHub expert. Answer questions about code,
API references, and implementation details by searching
repositories, issues, and pull requests.
  `.trim(),
});

const notionAgent = createAgent({
  model: llm,
  tools: [searchNotion, getPage],
  systemPrompt: `
You are a Notion expert. Answer questions about internal
processes, policies, and team documentation by searching
the organization's Notion workspace.
  `.trim(),
});

const slackAgent = createAgent({
  model: llm,
  tools: [searchSlack, getThread],
  systemPrompt: `
You are a Slack expert. Answer questions by searching
relevant threads and discussions where team members have
shared knowledge and solutions.
  `.trim(),
});
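Each specialist can also be invoked on its own, which is handy for checking a prompt or tool wiring before building the graph. A minimal check (output varies by model):
// Invoke one specialist directly to verify its tools and prompt.
const check = await githubAgent.invoke({
  messages: [{ role: "user", content: "Where is authentication implemented?" }],
});
console.log(check.messages.at(-1)?.content);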

4. Build the router workflow

Now build the router workflow using a StateGraph. The workflow has four main steps:
  1. Classify: Analyze the query and determine which agents to invoke with what sub-questions
  2. Route: Fan out to selected agents in parallel using Send
  3. Query agents: Each agent receives a simple AgentInput and returns an AgentOutput
  4. Synthesize: Combine collected results into a coherent response
import { StateGraph, START, END, Send } from "@langchain/langgraph";
import { z } from "zod";

const routerLlm = new ChatOpenAI({ model: "gpt-4o-mini" });


// Define structured output schema for the classifier
const ClassificationResultSchema = z.object({  
  classifications: z.array(z.object({
    source: z.enum(["github", "notion", "slack"]),
    query: z.string(),
  })).describe("List of agents to invoke with their targeted sub-questions"),
});


async function classifyQuery(state: typeof RouterState.State) {
  const structuredLlm = routerLlm.withStructuredOutput(ClassificationResultSchema);  

  const result = await structuredLlm.invoke([
    {
      role: "system",
      content: `Analyze this query and determine which knowledge bases to consult.
For each relevant source, generate a targeted sub-question optimized for that source.

Available sources:
- github: Code, API references, implementation details, issues, pull requests
- notion: Internal documentation, processes, policies, team wikis
- slack: Team discussions, informal knowledge sharing, recent conversations

Return ONLY the sources that are relevant to the query. Each source should have
a targeted sub-question optimized for that specific knowledge domain.

Example for "How do I authenticate API requests?":
- github: "What authentication code exists? Search for auth middleware, JWT handling"
- notion: "What authentication documentation exists? Look for API auth guides"
(slack omitted because it's not relevant for this technical question)`
    },
    { role: "user", content: state.query }
  ]);

  return { classifications: result.classifications };
}


function routeToAgents(state: typeof RouterState.State): Send[] {
  return state.classifications.map(
    (c) => new Send(c.source, { query: c.query })  
  );
}


async function queryGithub(state: AgentInput) {
  const result = await githubAgent.invoke({
    messages: [{ role: "user", content: state.query }]  
  });
  return { results: [{ source: "github", result: result.messages.at(-1)?.content }] };
}


async function queryNotion(state: AgentInput) {
  const result = await notionAgent.invoke({
    messages: [{ role: "user", content: state.query }]  
  });
  return { results: [{ source: "notion", result: result.messages.at(-1)?.content }] };
}


async function querySlack(state: AgentInput) {
  const result = await slackAgent.invoke({
    messages: [{ role: "user", content: state.query }]  
  });
  return { results: [{ source: "slack", result: result.messages.at(-1)?.content }] };
}


async function synthesizeResults(state: typeof RouterState.State) {
  if (state.results.length === 0) {
    return { finalAnswer: "No results found from any knowledge source." };
  }

  // Format results for synthesis
  const formatted = state.results.map(
    (r) => `**From ${r.source.charAt(0).toUpperCase() + r.source.slice(1)}:**\n${r.result}`
  );

  const synthesisResponse = await routerLlm.invoke([
    {
      role: "system",
      content: `Synthesize these search results to answer the original question: "${state.query}"

- Combine information from multiple sources without redundancy
- Highlight the most relevant and actionable information
- Note any discrepancies between sources
- Keep the response concise and well-organized`
    },
    { role: "user", content: formatted.join("\n\n") }
  ]);

  return { finalAnswer: synthesisResponse.content };
}

5. Compile the workflow

Now assemble the workflow by connecting nodes with edges. The key is using addConditionalEdges with the routing function to enable parallel execution:
const workflow = new StateGraph(RouterState)
  .addNode("classify", classifyQuery)
  .addNode("github", queryGithub)
  .addNode("notion", queryNotion)
  .addNode("slack", querySlack)
  .addNode("synthesize", synthesizeResults)
  .addEdge(START, "classify")
  .addConditionalEdges("classify", routeToAgents, ["github", "notion", "slack"])
  .addEdge("github", "synthesize")
  .addEdge("notion", "synthesize")
  .addEdge("slack", "synthesize")
  .addEdge("synthesize", END)
  .compile();
The addConditionalEdges call connects the classify node to the agent nodes through the routeToAgents function. When routeToAgents returns multiple Send objects, those nodes execute in parallel.

6. Use the router

Test your router with queries that span multiple knowledge domains:
const result = await workflow.invoke({
  query: "How do I authenticate API requests?"
});

console.log("Original query:", result.query);
console.log("\nClassifications:");
for (const c of result.classifications) {
  console.log(`  ${c.source}: ${c.query}`);
}
console.log("\n" + "=".repeat(60) + "\n");
console.log("Final Answer:");
console.log(result.finalAnswer);
Expected output:
Original query: How do I authenticate API requests?

Classifications:
  github: What authentication code exists? Search for auth middleware, JWT handling
  notion: What authentication documentation exists? Look for API auth guides

============================================================

Final Answer:
To authenticate API requests, you have several options:

1. **JWT Tokens**: The recommended approach for most use cases.
   Implementation details are in `src/auth.py` (PR #156).

2. **OAuth2 Flow**: For third-party integrations, follow the OAuth2
   flow documented in Notion's 'API Authentication Guide'.

3. **API Keys**: For server-to-server communication, use Bearer tokens
   in the Authorization header.

For token refresh handling, see issue #203 and PR #178 for the latest
OAuth scope updates.
The router analyzed the query, classified it to determine which agents to invoke (GitHub and Notion, but not Slack for this technical question), queried both agents in parallel, and synthesized the results into a coherent answer.

7. Understanding the architecture

The router workflow follows a clear pattern:

Classification phase

The classifyQuery function uses structured output to analyze the user’s query and determine which agents to invoke. This is where the routing intelligence lives:
  • Uses a Pydantic model (Python) or Zod schema (JS) to ensure valid output
  • Returns a list of Classification objects, each with a source and targeted query
  • Only includes relevant sources—irrelevant ones are simply omitted
This structured approach is more reliable than free-form JSON parsing and makes the routing logic explicit.
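You can also exercise the classifier node in isolation before compiling the graph. A quick sketch, where the empty fields are just placeholders to satisfy the RouterState type:
// Run the classifier node by itself; unused fields are placeholders.
const preview = await classifyQuery({
  query: "How do I authenticate API requests?",
  classifications: [],
  results: [],
  finalAnswer: "",
});
console.log(preview.classifications);
// e.g. [{ source: "github", query: "..." }, { source: "notion", query: "..." }]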

Parallel execution with Send

The routeToAgents function maps classifications to Send objects. Each Send specifies the target node and the state to pass:
// Classifications: [{ source: "github", query: "..." }, { source: "notion", query: "..." }]
// become:
[new Send("github", { query: "..." }), new Send("notion", { query: "..." })]
// Both agents execute simultaneously, each receiving only the query it needs
Each agent node receives a simple AgentInput with just a query field—not the full router state. This keeps the interface clean and explicit.

Result collection with reducers

Agent results flow back to the main state via a reducer. Each agent returns:
{"results": [{"source": "github", "result": "..."}]}
The reducer (operator.add in Python) concatenates these lists, collecting all parallel results into state["results"].

Synthesis phase

After all agents complete, the synthesizeResults function iterates over the collected results:
  • Waits for all parallel branches to complete (LangGraph handles this automatically)
  • References the original query to ensure the answer addresses what the user asked
  • Combines information from all sources without redundancy
Partial results: In this tutorial, all selected agents must complete before synthesis. For more advanced patterns where you want to handle partial results or timeouts, see the map-reduce guide.

8. Complete working example

Here’s everything together in a runnable script. The full version is simply the code from steps 1–6 combined into a single file.
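Since that listing would be long, here is a condensed sketch instead: two verticals, one stub tool each, and shortened prompts, so the end-to-end shape fits in one place. Swap in the full tool set, the Slack agent, and the richer prompts from the steps above for the complete version:
import { createAgent, tool } from "langchain";
import { Annotation, StateGraph, START, END, Send } from "@langchain/langgraph";
import { ChatOpenAI } from "@langchain/openai";
import { z } from "zod";

// State (step 1), condensed to two verticals
interface AgentInput {
  query: string;
}

const RouterState = Annotation.Root({
  query: Annotation<string>(),
  classifications: Annotation<{ source: "github" | "notion"; query: string }[]>(),
  results: Annotation<{ source: string; result: string }[]>({
    reducer: (current, update) => current.concat(update),
    default: () => [],
  }),
  finalAnswer: Annotation<string>(),
});

// One stub tool per vertical (step 2, condensed)
const searchCode = tool(
  async ({ query }) => `Found code matching '${query}': auth middleware in src/auth.py`,
  { name: "search_code", description: "Search code in GitHub repositories.", schema: z.object({ query: z.string() }) }
);
const searchNotion = tool(
  async ({ query }) => `Found documentation: 'API Authentication Guide'`,
  { name: "search_notion", description: "Search Notion for documentation.", schema: z.object({ query: z.string() }) }
);

// Specialized agents (step 3), with shortened prompts
const llm = new ChatOpenAI({ model: "gpt-4o" });
const githubAgent = createAgent({ model: llm, tools: [searchCode], systemPrompt: "You are a GitHub expert." });
const notionAgent = createAgent({ model: llm, tools: [searchNotion], systemPrompt: "You are a Notion expert." });

// Classifier and synthesis (step 4)
const routerLlm = new ChatOpenAI({ model: "gpt-4o-mini" });
const ClassificationResult = z.object({
  classifications: z.array(
    z.object({ source: z.enum(["github", "notion"]), query: z.string() })
  ),
});

async function classifyQuery(state: typeof RouterState.State) {
  const result = await routerLlm.withStructuredOutput(ClassificationResult).invoke([
    { role: "system", content: "Pick the relevant sources (github: code, notion: docs) and write a targeted sub-question for each. Only include relevant sources." },
    { role: "user", content: state.query },
  ]);
  return { classifications: result.classifications };
}

const routeToAgents = (state: typeof RouterState.State) =>
  state.classifications.map((c) => new Send(c.source, { query: c.query }));

async function queryGithub(state: AgentInput) {
  const r = await githubAgent.invoke({ messages: [{ role: "user", content: state.query }] });
  return { results: [{ source: "github", result: String(r.messages.at(-1)?.content) }] };
}

async function queryNotion(state: AgentInput) {
  const r = await notionAgent.invoke({ messages: [{ role: "user", content: state.query }] });
  return { results: [{ source: "notion", result: String(r.messages.at(-1)?.content) }] };
}

async function synthesizeResults(state: typeof RouterState.State) {
  const formatted = state.results.map((r) => `From ${r.source}:\n${r.result}`).join("\n\n");
  const response = await routerLlm.invoke([
    { role: "system", content: `Synthesize these results to answer: "${state.query}"` },
    { role: "user", content: formatted },
  ]);
  return { finalAnswer: response.content };
}

// Assemble and run (steps 5-6)
const workflow = new StateGraph(RouterState)
  .addNode("classify", classifyQuery)
  .addNode("github", queryGithub)
  .addNode("notion", queryNotion)
  .addNode("synthesize", synthesizeResults)
  .addEdge(START, "classify")
  .addConditionalEdges("classify", routeToAgents, ["github", "notion"])
  .addEdge("github", "synthesize")
  .addEdge("notion", "synthesize")
  .addEdge("synthesize", END)
  .compile();

const result = await workflow.invoke({ query: "How do I authenticate API requests?" });
console.log(result.finalAnswer);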

9. Advanced: Stateful routers

The router we’ve built so far is stateless—each request is handled independently with no memory between calls. For multi-turn conversations, you need a stateful approach.

Tool wrapper approach

The simplest way to add conversation memory is to wrap the stateless router as a tool that a conversational agent can call:
import { MemorySaver } from "@langchain/langgraph";

const searchKnowledgeBase = tool(
  async ({ query }) => {
    const result = await workflow.invoke({ query });
    return result.finalAnswer;
  },
  {
    name: "search_knowledge_base",
    description: `Search across multiple knowledge sources (GitHub, Notion, Slack).
Use this to find information about code, documentation, or team discussions.`,
    schema: z.object({
      query: z.string().describe("The search query"),
    }),
  }
);

const conversationalAgent = createAgent({
  model: llm,
  tools: [searchKnowledgeBase],
  systemPrompt: `
You are a helpful assistant that answers questions about our organization.
Use the search_knowledge_base tool to find information across our code,
documentation, and team discussions.
  `.trim(),
  checkpointer: new MemorySaver(),
});
This approach keeps the router stateless while the conversational agent handles memory and context. The user can have a multi-turn conversation, and the agent will call the router tool as needed.
const config = { configurable: { thread_id: "user-123" } };

let result = await conversationalAgent.invoke(
  { messages: [{ role: "user", content: "How do I authenticate API requests?" }] },
  config
);
console.log(result.messages.at(-1)?.content);

result = await conversationalAgent.invoke(
  { messages: [{ role: "user", content: "What about rate limiting for those endpoints?" }] },
  config
);
console.log(result.messages.at(-1)?.content);
The tool wrapper approach is recommended for most use cases. It provides clean separation: the router handles multi-source querying, while the conversational agent handles context and memory.

Full persistence approach

If you need the router itself to maintain state—for example, to use previous search results in routing decisions—use persistence to store message history at the router level.
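A minimal sketch of that variant, reusing the nodes from step 5 and compiling with a checkpointer (a production stateful router would also add a message-history field to RouterState so past results can inform routing decisions):
import { MemorySaver } from "@langchain/langgraph";

// Same graph as step 5, compiled with a checkpointer so each thread
// persists its state across invocations.
const statefulWorkflow = new StateGraph(RouterState)
  .addNode("classify", classifyQuery)
  .addNode("github", queryGithub)
  .addNode("notion", queryNotion)
  .addNode("slack", querySlack)
  .addNode("synthesize", synthesizeResults)
  .addEdge(START, "classify")
  .addConditionalEdges("classify", routeToAgents, ["github", "notion", "slack"])
  .addEdge("github", "synthesize")
  .addEdge("notion", "synthesize")
  .addEdge("slack", "synthesize")
  .addEdge("synthesize", END)
  .compile({ checkpointer: new MemorySaver() });

// Each thread_id now accumulates persisted state across calls.
const config = { configurable: { thread_id: "router-session-1" } };
await statefulWorkflow.invoke({ query: "How do I authenticate API requests?" }, config);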
Stateful routers add complexity. When routing to different agents across turns, conversations may feel inconsistent if agents have different tones or prompts. Consider the handoffs pattern or subagents pattern instead—both provide clearer semantics for multi-turn conversations with different agents.

10. Key takeaways

The router pattern excels when you have:
  • Distinct verticals: Separate knowledge domains that each require specialized tools and prompts
  • Parallel query needs: Questions that benefit from querying multiple sources simultaneously
  • Synthesis requirements: Results from multiple sources need to be combined into a coherent response
The pattern has three phases: decompose (analyze the query and generate targeted sub-questions), route (execute queries in parallel), and synthesize (combine results).
When to use the router pattern: Use the router pattern when you have multiple independent knowledge sources, need low-latency parallel queries, and want explicit control over routing logic. For simpler cases with dynamic tool selection, consider the subagents pattern. For workflows where agents need to converse with users sequentially, consider handoffs.
