# Integration testing

> Test agents with real LLM APIs by organizing tests, managing keys, handling flakiness, and controlling costs.

Integration tests verify that your agent works correctly with model APIs and external services. Unlike [unit tests](/oss/javascript/langchain/test/unit-testing) that use fakes and mocks, integration tests make actual network calls to confirm that components work together, credentials are valid, and latency is acceptable.

Because LLM responses are nondeterministic, integration tests require different strategies than traditional software tests. This guide covers how to organize, write, and run integration tests for your agents. For general test infrastructure when contributing to LangChain itself, see [Contributing to code](/oss/javascript/contributing/code#running-tests).

## Separate unit and integration tests

Integration tests are slower and require API credentials, so keep them separate from unit tests. This lets you run fast unit tests on every change and reserve integration tests for CI or pre-deploy checks.

Use a file naming convention to tell the two apart. Name integration test files `*.int.test.ts`, then configure Vitest to exclude them from default runs and to include only them when run with `--mode int`:

```ts vitest.config.ts theme={"theme":{"light":"catppuccin-latte","dark":"catppuccin-mocha"}}
import { configDefaults, defineConfig } from "vitest/config";

export default defineConfig((env) => {
  if (env.mode === "int") {
    return {
      test: {
        testTimeout: 100_000,
        include: ["**/*.int.test.ts"],
        setupFiles: ["dotenv/config"],
      },
    };
  }

  return {
    test: {
      testTimeout: 30_000,
      exclude: ["**/*.int.test.ts", ...configDefaults.exclude],
    },
  };
});
```

Add scripts to `package.json`:

```json theme={"theme":{"light":"catppuccin-latte","dark":"catppuccin-mocha"}}
{
  "scripts": {
    "test": "vitest",
    "test:integration": "vitest --mode int"
  }
}
```

Run integration tests explicitly:

```bash theme={"theme":{"light":"catppuccin-latte","dark":"catppuccin-mocha"}}
npm run test:integration
```

## Manage API keys

Integration tests require real API credentials. Load them from environment variables so keys stay out of source control.

Add `dotenv/config` as a Vitest setup file so environment variables load automatically from `.env` (this requires the `dotenv` package to be installed):

```ts vitest.config.ts theme={"theme":{"light":"catppuccin-latte","dark":"catppuccin-mocha"}}
import { defineConfig } from "vitest/config";

export default defineConfig({
  test: {
    setupFiles: ["dotenv/config"],
  },
});
```

```bash .env theme={"theme":{"light":"catppuccin-latte","dark":"catppuccin-mocha"}}
OPENAI_API_KEY=sk-...
```

Skip tests when keys are missing:

```ts theme={"theme":{"light":"catppuccin-latte","dark":"catppuccin-mocha"}}
import { test } from "vitest";

test.skipIf(!process.env.OPENAI_API_KEY)(
  "agent responds with tool call",
  async () => {
    // ...
  }
);
```

<Warning>
  Add `.env` to your `.gitignore` to avoid committing credentials. In CI, inject secrets through your provider's secrets management (e.g., GitHub Actions secrets).
</Warning>
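
When a test file depends on several credentials, the skip condition can be computed once. The following is a minimal sketch, assuming a hypothetical helper named `missingKeys` (it is not part of Vitest or LangChain) that reports which required variables are unset or empty. The variable names are only illustrative.

```ts
// Hypothetical helper (not part of Vitest or LangChain): returns the names of
// required environment variables that are unset, empty, or whitespace-only.
type Env = Record<string, string | undefined>;

function missingKeys(required: string[], env: Env): string[] {
  return required.filter((name) => !env[name]?.trim());
}

// Example with an in-memory environment instead of process.env.
const missing = missingKeys(["OPENAI_API_KEY", "ANTHROPIC_API_KEY"], {
  OPENAI_API_KEY: "sk-test",
});
console.log(missing); // ["ANTHROPIC_API_KEY"]
```

In a test file, the result could feed the same `skipIf` pattern shown above, for example `test.skipIf(missingKeys(["OPENAI_API_KEY"], process.env).length > 0)(...)`.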

## Assert on structure, not content

LLM responses vary between runs. Instead of asserting on exact output strings, verify the structural properties of the response: message types, tool call names, argument shapes, and message count.

```ts theme={"theme":{"light":"catppuccin-latte","dark":"catppuccin-mocha"}}
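import { AIMessage, HumanMessage } from "@langchain/core/messages";
import { createAgent } from "langchain";
import { expect, test } from "vitest";

// `getWeather` is a tool defined elsewhere in your project.
test("agent calls weather tool", async () => {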
  const agent = createAgent({ model: "claude-sonnet-4-6", tools: [getWeather] });
  const result = await agent.invoke({
    messages: [new HumanMessage("What's the weather in SF?")]
  });

  const aiMsg = result.messages.find(
    (m) => AIMessage.isInstance(m) && m.tool_calls?.length
  );
  expect(aiMsg).toContainToolCall({ name: "get_weather" });
  expect(result.messages.at(-1)).toBeAIMessage();
});
```

This example uses [custom test matchers](#use-custom-test-matchers). See the section below for setup and the full matcher reference.

<Tip>
  For more rigorous trajectory assertions, use the [AgentEvals](/oss/javascript/langchain/test/evals) evaluators which support fuzzy matching modes like `unordered` and `superset`.
</Tip>
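
To make "structure, not content" concrete without any library, here is a rough sketch on plain objects. The `Msg` and `ToolCall` types are simplified stand-ins for LangChain messages, and `summarize` is a hypothetical helper. Both are assumptions for illustration, not the library's API.

```ts
// Simplified stand-ins for message objects. Real LangChain messages are class
// instances with more fields; these shapes are assumptions for illustration.
type ToolCall = { name: string; args: Record<string, unknown> };
type Msg = {
  type: "human" | "ai" | "tool";
  content: string;
  tool_calls?: ToolCall[];
};

// Reduce a transcript to its structure: message types and tool call names.
function summarize(messages: Msg[]): { types: string[]; toolNames: string[] } {
  const toolNames: string[] = [];
  for (const m of messages) {
    for (const call of m.tool_calls ?? []) toolNames.push(call.name);
  }
  return { types: messages.map((m) => m.type), toolNames };
}

// Two runs whose final wording differs but whose structure is identical.
const runA: Msg[] = [
  { type: "human", content: "What's the weather in SF?" },
  { type: "ai", content: "", tool_calls: [{ name: "get_weather", args: { city: "SF" } }] },
  { type: "tool", content: "72°F and sunny" },
  { type: "ai", content: "It's 72°F and sunny in San Francisco." },
];
const runB: Msg[] = runA.map((m, i) =>
  i === 3 ? { ...m, content: "Sunny, around 72 degrees." } : m
);

const sameStructure =
  JSON.stringify(summarize(runA)) === JSON.stringify(summarize(runB));
console.log(sameStructure); // true
```

An exact string comparison of the final messages would fail between these two runs, while the structural summary stays stable.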

## Use custom test matchers

LangChain ships [custom Vitest matchers](https://vitest.dev/guide/extending-matchers.html), exported from `@langchain/core/testing`, that make structural assertions more readable and produce clear error messages on failure. Register them once in a setup file and they become available on every `expect()` call.

### Set up

Add a Vitest setup file that extends `expect` with the LangChain matchers:

```ts vitest.setup.ts theme={"theme":{"light":"catppuccin-latte","dark":"catppuccin-mocha"}}
import { langchainMatchers } from "@langchain/core/testing";
import { expect } from "vitest";

expect.extend(langchainMatchers);
```

Reference it in your Vitest config:

```ts vitest.config.ts theme={"theme":{"light":"catppuccin-latte","dark":"catppuccin-mocha"}}
import { defineConfig } from "vitest/config";

export default defineConfig({
  test: {
    setupFiles: ["vitest.setup.ts"],
  },
});
```

TypeScript types are included automatically, so no extra configuration is needed for autocomplete.

### Check message types

Each message class has a corresponding matcher: `toBeHumanMessage()`, `toBeAIMessage()`, `toBeSystemMessage()`, and `toBeToolMessage()`. Call a matcher without arguments to check only the type, or pass a string to also match the content:

```ts theme={"theme":{"light":"catppuccin-latte","dark":"catppuccin-mocha"}}
const response = await agent.invoke({
  messages: [new HumanMessage("What's the weather?")]
});
const lastMessage = response.messages.at(-1);

expect(lastMessage).toBeAIMessage();
expect(lastMessage).toBeAIMessage("It's 72°F and sunny.");
```

Pass an object to match specific fields:

```ts theme={"theme":{"light":"catppuccin-latte","dark":"catppuccin-mocha"}}
expect(lastMessage).toBeAIMessage({ name: "weather-bot" });
expect(toolMsg).toBeToolMessage({ tool_call_id: "call_1" });
```

### Assert on tool calls

Three matchers cover tool call assertions on an [`AIMessage`](https://reference.langchain.com/javascript/langchain-core/messages/AIMessage):

```ts theme={"theme":{"light":"catppuccin-latte","dark":"catppuccin-mocha"}}
const response = await agent.invoke({
  messages: [new HumanMessage("Weather in SF and NYC?")]
});
const aiMsg = response.messages.find(
  (m) => AIMessage.isInstance(m) && m.tool_calls?.length
);

// Check that the message has exactly these tool calls (order-independent)
expect(aiMsg).toHaveToolCalls([
  { name: "get_weather", args: { city: "San Francisco" } },
  { name: "get_weather", args: { city: "New York" } },
]);

// Check only the count
expect(aiMsg).toHaveToolCallCount(2);

// Check that at least one tool call matches (supports .not)
expect(aiMsg).toContainToolCall({ name: "get_weather" });
expect(aiMsg).not.toContainToolCall({ name: "send_email" });
```
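
The idea of an order-independent comparison can be illustrated with a standalone sketch. This is not the matcher's implementation: `sameToolCalls` is a hypothetical helper that compares names and args exactly, and it assumes args are flat objects of JSON-serializable values.

```ts
type ToolCall = { name: string; args: Record<string, unknown> };

// Canonical string for a call: name plus args with keys sorted, so that
// { a: 1, b: 2 } and { b: 2, a: 1 } compare equal. Assumes flat args.
function callKey(call: ToolCall): string {
  const sorted = Object.keys(call.args)
    .sort()
    .map((k) => [k, call.args[k]]);
  return JSON.stringify([call.name, sorted]);
}

// True when both lists contain the same calls, ignoring order.
function sameToolCalls(actual: ToolCall[], expected: ToolCall[]): boolean {
  if (actual.length !== expected.length) return false;
  const a = actual.map(callKey).sort();
  const e = expected.map(callKey).sort();
  return a.every((k, i) => k === e[i]);
}

const actual: ToolCall[] = [
  { name: "get_weather", args: { city: "New York" } },
  { name: "get_weather", args: { city: "San Francisco" } },
];
const expected: ToolCall[] = [
  { name: "get_weather", args: { city: "San Francisco" } },
  { name: "get_weather", args: { city: "New York" } },
];

console.log(sameToolCalls(actual, expected)); // true
console.log(sameToolCalls(actual, expected.slice(0, 1))); // false
```

Sorting canonical keys treats the calls as a multiset, so duplicate calls to the same tool are counted rather than collapsed.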

### Assert on tool messages

`toHaveToolMessages()` takes the full message array and checks the [`ToolMessage`](https://reference.langchain.com/javascript/langchain-core/messages/ToolMessage) instances within it, in order:

```ts theme={"theme":{"light":"catppuccin-latte","dark":"catppuccin-mocha"}}
expect(response.messages).toHaveToolMessages([
  { content: "72°F and sunny in San Francisco" },
  { content: "68°F and cloudy in New York" },
]);
```

### Assert on interrupts and structured responses

`toHaveBeenInterrupted()` checks for a `__interrupt__` field in a [LangGraph interrupt](/oss/javascript/langchain/human-in-the-loop) result. Pass a value to match the interrupt payload:

```ts theme={"theme":{"light":"catppuccin-latte","dark":"catppuccin-mocha"}}
const result = await graph.invoke(input);

expect(result).toHaveBeenInterrupted();
expect(result).toHaveBeenInterrupted("confirm_action");
```

`toHaveStructuredResponse()` checks for a `structuredResponse` field on the result. Pass an object to match specific fields:

```ts theme={"theme":{"light":"catppuccin-latte","dark":"catppuccin-mocha"}}
expect(result).toHaveStructuredResponse();
expect(result).toHaveStructuredResponse({ name: "Alice", age: 30 });
```

### Matcher reference

| Matcher                               | Description                                                                                      |
| ------------------------------------- | ------------------------------------------------------------------------------------------------ |
| `toBeHumanMessage(expected?)`         | Check that the value is a `HumanMessage`. Optionally match content (string) or fields (object).  |
| `toBeAIMessage(expected?)`            | Check that the value is an `AIMessage`. Optionally match content or fields.                      |
| `toBeSystemMessage(expected?)`        | Check that the value is a `SystemMessage`. Optionally match content or fields.                   |
| `toBeToolMessage(expected?)`          | Check that the value is a `ToolMessage`. Optionally match content or fields like `tool_call_id`. |
| `toHaveToolCalls(expected)`           | Check that an `AIMessage` has exactly the given tool calls (order-independent).                  |
| `toHaveToolCallCount(n)`              | Check that an `AIMessage` has exactly `n` tool calls.                                            |
| `toContainToolCall(expected)`         | Check that an `AIMessage` contains at least one matching tool call. Supports `.not`.             |
| `toHaveToolMessages(expected)`        | Check that a message array contains the given `ToolMessage` instances, in order.                 |
| `toHaveBeenInterrupted(value?)`       | Check that a result has an `__interrupt__`. Optionally match the interrupt value.                |
| `toHaveStructuredResponse(expected?)` | Check that a result has a `structuredResponse`. Optionally match specific fields.                |

## Reduce cost and latency

Integration tests that call LLM APIs incur real costs. A few practices help keep test suites fast and affordable:

* **Use smaller models**: Choose `gemini-3.1-flash-lite` or an equivalent small model for tests that only need to verify tool calling and response structure.
* **Set `maxTokens`**: Cap response length to avoid long, expensive completions.
* **Limit test scope**: Test one behavior per test. Avoid end-to-end scenarios that chain many LLM calls when a single-turn test suffices.
* **Run selectively**: Use the test separation from [above](#separate-unit-and-integration-tests) to run integration tests only in CI or before deploy, not on every file save.

```ts theme={"theme":{"light":"catppuccin-latte","dark":"catppuccin-mocha"}}
const agent = createAgent({
  model: "gemini-3.1-flash-lite",
  tools: [getWeather],
  modelArgs: { maxTokens: 256 },
});
```
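
Capping `maxTokens` also makes the worst-case output volume of a run easy to bound. The following is a back-of-the-envelope sketch, not a billing tool: `worstCaseOutput` is a hypothetical helper, input tokens are ignored, and the price is a placeholder you would take from your provider's pricing page.

```ts
// Hypothetical helper: upper bound on output tokens and output cost for a run.
// Ignores input tokens; the price is a caller-supplied placeholder.
function worstCaseOutput(
  modelCalls: number,
  maxTokens: number,
  pricePerMillionOutputTokens: number
): { tokens: number; cost: number } {
  const tokens = modelCalls * maxTokens;
  return { tokens, cost: (tokens / 1_000_000) * pricePerMillionOutputTokens };
}

// 40 model calls capped at 256 tokens each, with a made-up price of 1.0
// (in your billing currency) per million output tokens.
const bound = worstCaseOutput(40, 256, 1.0);
console.log(bound.tokens); // 10240
```

Because multi-turn agent tests make several model calls each, count model calls rather than test cases when estimating.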

## Next steps

Learn how to evaluate agent trajectories with deterministic matching or LLM-as-judge evaluators in [Evals](/oss/javascript/langchain/test/evals).

