- Unit tests exercise small, deterministic pieces of your agent in isolation, using in-memory fakes so you can assert exact behavior quickly.
- Integration tests run the agent with real network calls to confirm that components work together, credentials and schemas line up, and latency is acceptable.
- Evals use evaluators to assess your agent’s execution trajectory, either via deterministic matching or an LLM judge.
Unit testing
Mock chat models and use in-memory persistence to test agent logic without API calls.
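As a rough illustration of the idea, the sketch below tests a toy agent step against a fake model and a dict-backed store. `FakeChatModel`, `InMemoryStore`, and `run_agent` are hypothetical names invented for this example, not part of any library; a real project would substitute its framework's own fake model and in-memory persistence.

```python
from dataclasses import dataclass, field


@dataclass
class FakeChatModel:
    """Hypothetical stand-in for a chat model: replays canned replies in order."""
    replies: list[str]
    calls: list[str] = field(default_factory=list)

    def invoke(self, prompt: str) -> str:
        self.calls.append(prompt)
        return self.replies[len(self.calls) - 1]


@dataclass
class InMemoryStore:
    """Hypothetical persistence layer backed by a dict instead of a database."""
    threads: dict[str, list[str]] = field(default_factory=dict)

    def append(self, thread_id: str, message: str) -> None:
        self.threads.setdefault(thread_id, []).append(message)


def run_agent(model, store, thread_id: str, user_input: str) -> str:
    """Toy agent step: record the input, call the model, record the reply."""
    store.append(thread_id, f"user: {user_input}")
    reply = model.invoke(user_input)
    store.append(thread_id, f"assistant: {reply}")
    return reply


def test_agent_records_turn():
    model = FakeChatModel(replies=["It is sunny."])
    store = InMemoryStore()
    # No network or API key involved, so the exact output can be asserted.
    assert run_agent(model, store, "t1", "Weather?") == "It is sunny."
    assert model.calls == ["Weather?"]
    assert store.threads["t1"] == ["user: Weather?", "assistant: It is sunny."]
```

Because the fake replays fixed replies, the test can assert on exact strings and on what the agent persisted, which is not practical against a live model.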
Integration testing
Test your agent with real LLM APIs. Organize tests, manage keys, handle flakiness, and control costs.
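Two of these concerns, key management and flakiness, can be sketched with the standard library alone. The helpers below are assumptions made for illustration: the environment variable name `EXAMPLE_LLM_API_KEY` is made up, and treating `ConnectionError` as the transient failure is a simplification, since real clients usually raise their own exception types.

```python
import os
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def api_key_or_none(var_name: str = "EXAMPLE_LLM_API_KEY") -> Optional[str]:
    """Return the key from the environment, or None so the caller can skip the test."""
    return os.environ.get(var_name) or None


def call_with_retries(fn: Callable[[], T], attempts: int = 3, base_delay: float = 0.5) -> T:
    """Retry a flaky call with exponential backoff; re-raise after the last attempt."""
    for attempt in range(attempts):
        try:
            return fn()
        except ConnectionError:
            if attempt == attempts - 1:
                raise
            # Wait base_delay, then 2x, then 4x, ... between attempts.
            time.sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")
```

A test would typically skip itself when `api_key_or_none()` returns `None`, and wrap only the network call in `call_with_retries`. Keeping `attempts` small also caps how many paid requests a single failing test can make.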
Evals
Evaluate agent trajectories with deterministic matching or LLM-as-judge evaluators.
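The deterministic variant can be pictured as comparing the sequence of tool calls an agent made against a reference sequence. The sketch below is a minimal stand-alone version of that idea; `ToolCall`, `call`, `strict_match`, and `unordered_match` are hypothetical names for this example and do not reproduce any evaluator library's API.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolCall:
    name: str
    args: tuple  # sorted (key, value) pairs, so calls are hashable and comparable


def call(name: str, **kwargs) -> ToolCall:
    """Build a ToolCall with arguments in a canonical order."""
    return ToolCall(name, tuple(sorted(kwargs.items())))


def strict_match(actual, expected) -> bool:
    """Same tool calls, same arguments, same order."""
    return list(actual) == list(expected)


def unordered_match(actual, expected) -> bool:
    """Same tool calls and arguments, order ignored (duplicates still counted)."""
    return Counter(actual) == Counter(expected)


expected = [call("search", query="weather sf"), call("summarize")]
actual = [call("summarize"), call("search", query="weather sf")]

assert not strict_match(actual, expected)  # order differs
assert unordered_match(actual, expected)   # same calls overall
```

Strict matching suits workflows where step order matters; the unordered check is looser and tolerates an agent that reaches the same calls by a different route.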

