The langsmith Harbor environment runs Harbor trials on LangSmith sandboxes. Select it with -e langsmith to execute Harbor jobs on LangSmith infrastructure. LangSmith is one of several supported providers, alongside Daytona, Modal, and E2B.
Prerequisites
- A LangSmith account and an API key.
- Python with pip.
Install
Install Harbor with the langsmith extra:
Authenticate
Harbor authenticates with your LangSmith credentials. Set an API key:
LANGCHAIN_API_KEY works as well. Alternatively, select a LangSmith SDK profile instead of exporting a key:
Run an evaluation
Run a Harbor job and select the LangSmith environment with -e langsmith:
Configure the sandbox environment
The LangSmith environment boots each sandbox from a filesystem snapshot. Provide one of the following in your Harbor task:
- Prebuilt image: set [environment].docker_image in task.toml. Harbor reuses or creates a snapshot from that image.
- Existing snapshot: pass environment.kwargs.snapshot_name to boot from a snapshot you already created.
- Dockerfile: include an environment/Dockerfile. Harbor builds a snapshot from it with the build-from-Dockerfile flow, using the task environment/ directory as the build context.
The following sandbox options can be set with --ek:
- idle_ttl_seconds: stops an idle sandbox after this many seconds. Set 0 to disable the idle timeout.
- delete_after_stop_seconds: deletes a stopped sandbox after this many seconds.
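These options can also be attached when a job is launched from a script. The sketch below is an illustration under stated assumptions: add_langsmith_env is a hypothetical helper, the caller supplies the base Harbor command, and each option is passed as a separate --ek key=value pair, which is an assumed argument format that should be checked against Harbor's CLI help.

```python
import shlex


def add_langsmith_env(base_cmd, idle_ttl_seconds=None, delete_after_stop_seconds=None):
    """Append the LangSmith environment flag and optional --ek settings.

    Hypothetical helper. The `--ek key=value` form is an assumption.
    """
    cmd = list(base_cmd) + ["-e", "langsmith"]
    options = {
        "idle_ttl_seconds": idle_ttl_seconds,
        "delete_after_stop_seconds": delete_after_stop_seconds,
    }
    for key, value in options.items():
        if value is None:
            continue  # option not requested; do not pass it
        if value < 0:
            raise ValueError(f"{key} must be non-negative, got {value}")
        cmd += ["--ek", f"{key}={value}"]
    return cmd


if __name__ == "__main__":
    # The base command is a placeholder; add your own dataset and agent arguments.
    cmd = add_langsmith_env(
        ["harbor", "run"],
        idle_ttl_seconds=0,
        delete_after_stop_seconds=3600,
    )
    print(shlex.join(cmd))
```

The helper tests for None rather than truthiness, so idle_ttl_seconds=0 is still passed through; that matters because 0 is the value that disables the idle timeout.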
Run Deep Agents on LangSmith
Deep Agents runs against the LangSmith environment as a custom Harbor agent. To build and run a Deep Agent, see the Deep Agents documentation. The Harbor wrapper ships in the deepagents-evals package, which exposes deepagents_harbor:DeepAgentsWrapper and includes ready-made make run-terminal-bench-* targets. Install it in the same environment as Harbor:
Use a config file
Capture the same run in a Harbor job config:
Multi-container tasks
The LangSmith environment supports multi-container tasks. Include an environment/docker-compose.yaml file in your task definition to run several containers per trial. See the Harbor sandbox documentation for details.

