Separate unit and integration tests
Integration tests are slower and require API credentials, so keep them separate from unit tests. This lets you run fast unit tests on every change and reserve integration tests for CI or pre-deploy checks. Use a file naming convention to separate integration tests: name integration test files `*.int.test.ts` and configure vitest to exclude them from default runs:
vitest.config.ts
package.json:
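A minimal sketch of the config above, assuming the `*.int.test.ts` convention (the glob and script names are illustrative):

```typescript
// vitest.config.ts — exclude integration tests from the default run
import { defineConfig, configDefaults } from "vitest/config";

export default defineConfig({
  test: {
    // Keep vitest's default excludes and add the integration-test glob
    exclude: [...configDefaults.exclude, "**/*.int.test.ts"],
  },
});
```

In `package.json`, a pair of scripts can then target each suite, e.g. `"test": "vitest run"` for unit tests and `"test:int": "vitest run --config vitest.int.config.ts"` for integration runs against a second config that does not exclude the `*.int.test.ts` files.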
Manage API keys
Integration tests require real API credentials. Load them from environment variables so keys stay out of source control. Add `dotenv/config` as a vitest setup file so environment variables load automatically from `.env`:
vitest.config.ts
.env
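One way to wire this up, assuming a `.env` file at the project root (the variable name below is illustrative):

```typescript
// vitest.config.ts — dotenv/config runs as a setup file and populates
// process.env from .env before any test executes
import { defineConfig } from "vitest/config";

export default defineConfig({
  test: {
    setupFiles: ["dotenv/config"],
  },
});
```

The `.env` file then holds entries such as `OPENAI_API_KEY=sk-...` and should be listed in `.gitignore`.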
Assert on structure, not content
LLM responses vary between runs. Instead of asserting on exact output strings, verify the structural properties of the response: message types, tool call names, argument shapes, and message count.
Use custom test matchers
langchain ships custom vitest matchers that make structural assertions more readable and produce clear error messages on failure. Register them once in a setup file and they become available on every expect() call.
Set up
Add a vitest setup file that extends `expect` with the LangChain matchers:
vitest.setup.ts
vitest.config.ts
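A sketch of the setup file; the matcher import path is an assumption here, so verify it against the langchain package docs for your installed version:

```typescript
// vitest.setup.ts — register the LangChain matchers once for all tests.
// NOTE: the "langchain/vitest" module path is an assumption; check the
// langchain docs for the exact export location.
import { expect } from "vitest";
import * as matchers from "langchain/vitest";

expect.extend(matchers);
```

In `vitest.config.ts`, point `test.setupFiles` at this file (e.g. `setupFiles: ["./vitest.setup.ts"]`) so the matchers are available on every `expect()` call.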
Check message types
Each message class has a corresponding matcher: `toBeHumanMessage()`, `toBeAIMessage()`, `toBeSystemMessage()`, and `toBeToolMessage()`. Call without arguments to check only the type, or pass a string to also match content:
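An illustrative test, assuming the matchers are registered via the setup file described above; the messages are constructed directly rather than returned by a model:

```typescript
import { expect, test } from "vitest";
import { AIMessage, HumanMessage } from "@langchain/core/messages";

test("message types", () => {
  const messages = [new HumanMessage("What is 2 + 2?"), new AIMessage("4")];

  // Type-only check
  expect(messages[0]).toBeHumanMessage();
  // Type check plus content match
  expect(messages[1]).toBeAIMessage("4");
});
```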
Assert on tool calls
Three matchers cover tool call assertions on an `AIMessage`:
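A sketch under the same setup, using a hand-built `AIMessage` with one tool call (`get_weather` and its arguments are hypothetical):

```typescript
import { expect, test } from "vitest";
import { AIMessage } from "@langchain/core/messages";

test("tool call assertions", () => {
  const msg = new AIMessage({
    content: "",
    tool_calls: [{ id: "call_1", name: "get_weather", args: { city: "Paris" } }],
  });

  // Exact count, partial match, and exact set (order-independent)
  expect(msg).toHaveToolCallCount(1);
  expect(msg).toContainToolCall({ name: "get_weather" });
  expect(msg).toHaveToolCalls([{ name: "get_weather", args: { city: "Paris" } }]);
  // Negated form is supported for toContainToolCall
  expect(msg).not.toContainToolCall({ name: "send_email" });
});
```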
Assert on tool messages
`toHaveToolMessages()` takes the full message array and checks the `ToolMessage` instances within it, in order:
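For example (the expected-shape argument below is an assumption; consult the matcher docs for the exact form it accepts):

```typescript
import { expect, test } from "vitest";
import { AIMessage, HumanMessage, ToolMessage } from "@langchain/core/messages";

test("tool messages appear in order", () => {
  const messages = [
    new HumanMessage("Weather in Paris?"),
    new AIMessage({
      content: "",
      tool_calls: [{ id: "call_1", name: "get_weather", args: { city: "Paris" } }],
    }),
    new ToolMessage({ content: "18°C and sunny", tool_call_id: "call_1" }),
  ];

  // Checks the ToolMessage instances within the array, in order
  expect(messages).toHaveToolMessages([
    { tool_call_id: "call_1", content: "18°C and sunny" },
  ]);
});
```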
Assert on interrupts and structured responses
`toHaveBeenInterrupted()` checks for an `__interrupt__` field in a LangGraph interrupt result. Pass a value to match the interrupt payload:
`toHaveStructuredResponse()` checks for a `structuredResponse` field on the result. Pass an object to match specific fields:
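Illustrative shapes for both matchers; in real tests the objects come from `agent.invoke()`, and the field layouts below are assumptions:

```typescript
import { expect, test } from "vitest";

test("interrupt and structured response", () => {
  // A result carrying a LangGraph interrupt payload (shape is illustrative)
  const interrupted = { __interrupt__: [{ value: "Approve this tool call?" }] };
  expect(interrupted).toHaveBeenInterrupted();

  // A result carrying a structured response (fields are illustrative)
  const finished = { structuredResponse: { sentiment: "positive", score: 0.9 } };
  expect(finished).toHaveStructuredResponse({ sentiment: "positive" });
});
```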
Matcher reference
| Matcher | Description |
|---|---|
| `toBeHumanMessage(expected?)` | Check that the value is a `HumanMessage`. Optionally match content (string) or fields (object). |
| `toBeAIMessage(expected?)` | Check that the value is an `AIMessage`. Optionally match content or fields. |
| `toBeSystemMessage(expected?)` | Check that the value is a `SystemMessage`. Optionally match content or fields. |
| `toBeToolMessage(expected?)` | Check that the value is a `ToolMessage`. Optionally match content or fields like `tool_call_id`. |
| `toHaveToolCalls(expected)` | Check that an `AIMessage` has exactly the given tool calls (order-independent). |
| `toHaveToolCallCount(n)` | Check that an `AIMessage` has exactly `n` tool calls. |
| `toContainToolCall(expected)` | Check that an `AIMessage` contains at least one matching tool call. Supports `.not`. |
| `toHaveToolMessages(expected)` | Check that a message array contains the given `ToolMessage` instances, in order. |
| `toHaveBeenInterrupted(value?)` | Check that a result has an `__interrupt__`. Optionally match the interrupt value. |
| `toHaveStructuredResponse(expected?)` | Check that a result has a `structuredResponse`. Optionally match specific fields. |
Reduce cost and latency
Integration tests that call LLM APIs incur real costs. A few practices help keep test suites fast and affordable:
- Use smaller models: `gemini-3.1-flash-lite-preview` or equivalent for tests that only need to verify tool calling and response structure.
- Set `maxTokens`: Cap response length to avoid long, expensive completions.
- Limit test scope: Test one behavior per test. Avoid end-to-end scenarios that chain many LLM calls when a single-turn test suffices.
- Run selectively: Use the test separation from above to run integration tests only in CI or before deploy, not on every file save.
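The first two points can be sketched as a shared test model; the class and option names below depend on your provider package (`@langchain/google-genai` is assumed here, where the token cap is called `maxOutputTokens`):

```typescript
// testModel.ts — a small, capped model shared by integration tests
import { ChatGoogleGenerativeAI } from "@langchain/google-genai";

export const testModel = new ChatGoogleGenerativeAI({
  model: "gemini-3.1-flash-lite-preview",
  maxOutputTokens: 128, // cap completion length to keep tests cheap
  temperature: 0,       // reduce run-to-run variance for structural checks
});
```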
Next steps
Learn how to evaluate agent trajectories with deterministic matching or LLM-as-judge evaluators in Evals.

