Overview
The CI/CD pipeline provides:
- Automated testing: Unit, integration, and end-to-end tests.
- Offline evaluations: Performance assessment using AgentEvals, OpenEvals, and LangSmith.
- Preview and production deployments: Automated staging and quality-gated production releases using the Control Plane API.
- Monitoring: Continuous evaluation and alerting.
Pipeline architecture
The CI/CD pipeline consists of several key components that work together to ensure code quality and reliable deployments.
Trigger sources
The pipeline can be triggered at multiple points, both during development and once your application is live (a sample trigger configuration follows the list):
- Code changes: Pushes to the main or development branches, where you can modify the LangGraph architecture, try different models, update agent logic, or make any other code improvements.
- PromptHub updates: Changes to prompt templates stored in LangSmith PromptHub. Whenever there’s a new prompt commit, a webhook triggers the pipeline.
- Online evaluation alerts: Performance-degradation notifications from live deployments.
- LangSmith trace webhooks: Automated triggers based on trace analysis and performance metrics.
- Manual trigger: Manual initiation of the pipeline for testing or emergency deployments.
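To make this concrete, here is a minimal sketch of how these triggers could map onto a GitHub Actions `on:` block. The `repository_dispatch` event types are illustrative names that a PromptHub webhook or a LangSmith alert might send; adjust them to your setup.

```yaml
# Trigger section of a hypothetical .github/workflows/pipeline.yml
on:
  push:
    branches: [main, development]   # code changes
  pull_request:                     # preview deployments for open PRs
  repository_dispatch:
    # Placeholder event names sent by external webhooks (PromptHub commits,
    # online evaluation alerts, LangSmith trace webhooks).
    types: [prompt-commit, online-eval-alert, trace-webhook]
  workflow_dispatch:                # manual trigger
```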
Testing layers
Compared to traditional software, testing AI agent applications also requires assessing response quality, so it is important to test each part of the workflow. The pipeline implements multiple testing layers:
- Unit tests: Individual node and utility function testing.
- Integration tests: Component interaction testing.
- End-to-end tests: Full graph execution testing.
- Offline evaluations: Performance assessment with real-world scenarios, including end-to-end evaluations, single-step evaluations, agent trajectory analysis, and multi-turn simulations.
- LangGraph dev server tests: Use the langgraph-cli tool to spin up a local server (inside the GitHub Action) that runs the LangGraph agent. The pipeline polls the `/ok` server API endpoint until it responds, and throws an error if the server is not available within 30 seconds (sketched below).
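The health check in the last layer can be implemented with a few lines of polling logic. This sketch assumes the dev server’s default address (`http://127.0.0.1:2024`) and mirrors the 30-second budget described above:

```python
import time

import requests

DEV_SERVER_URL = "http://127.0.0.1:2024"  # default `langgraph dev` address (assumed)
TIMEOUT_SECONDS = 30

def wait_for_dev_server() -> None:
    """Poll the /ok endpoint until the LangGraph dev server responds."""
    deadline = time.time() + TIMEOUT_SECONDS
    while time.time() < deadline:
        try:
            if requests.get(f"{DEV_SERVER_URL}/ok", timeout=2).status_code == 200:
                return  # server is up; tests can proceed
        except requests.ConnectionError:
            pass  # server not listening yet
        time.sleep(1)
    raise RuntimeError(f"Dev server not ready after {TIMEOUT_SECONDS}s")
```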
GitHub Actions Workflow
The CI/CD pipeline uses GitHub Actions with the Control Plane API and LangSmith API to automate deployment. A helper script manages API interactions and deployments: https://github.com/langchain-ai/cicd-pipeline-example/blob/main/.github/scripts/langgraph_api.py. The workflow includes:
- New agent deployment: When a new PR is opened and tests pass, a preview deployment is created in LangSmith Deployments using the Control Plane API. This lets you test the agent in a staging environment before promoting it to production.
- Agent deployment revision: A revision happens when an existing deployment with the same ID is found, or when the PR is merged into main. On a merge to main, the preview deployment is deleted and a production deployment is created, ensuring that updates to the agent are properly deployed and integrated into the production infrastructure.
- Testing and evaluation workflow: In addition to the traditional testing phases (unit, integration, and end-to-end tests), the pipeline includes offline evaluations and LangGraph dev server testing, because you also want to test the quality of your agent. These evaluations assess the agent’s performance using real-world scenarios and data.
See the LangGraph testing documentation for specific testing approaches and the evaluation approaches guide for a comprehensive overview of offline evaluations.
Final Response Evaluation
Evaluates the final output of your agent against expected results. This is the most common type of evaluation; it checks whether the agent’s final response meets quality standards and answers the user’s question correctly.
Single Step Evaluation
Tests individual steps or nodes within your LangGraph workflow. This lets you validate specific components of your agent’s logic in isolation, ensuring each step functions correctly before testing the full pipeline.
Agent Trajectory Evaluation
Analyzes the complete path your agent takes through the graph, including all intermediate steps and decision points. This helps identify bottlenecks, unnecessary steps, or suboptimal routing in your agent’s workflow. It also evaluates whether your agent invoked the right tools in the right order or at the right time.
Multi-Turn Evaluation
Tests conversational flows where the agent maintains context across multiple interactions. This is crucial for agents that handle follow-up questions, clarifications, or extended dialogues with users.
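As an illustration of the first (and most common) type, a final response evaluation in CI might look like the following sketch. It uses the LangSmith SDK’s `evaluate` API; the dataset name, the agent import path, and the `answer` keys are placeholders for your project:

```python
from langsmith import Client

# Hypothetical import of the compiled graph from ./agents/simple_text2sql.py
from agents.simple_text2sql import agent

client = Client()  # reads LANGSMITH_API_KEY from the environment

def target(inputs: dict) -> dict:
    """Run the agent on a single dataset example."""
    result = agent.invoke({"question": inputs["question"]})
    return {"answer": result["answer"]}

def correct_answer(outputs: dict, reference_outputs: dict) -> bool:
    """Exact-match check of the final response against the reference."""
    return outputs["answer"].strip() == reference_outputs["answer"].strip()

client.evaluate(
    target,
    data="text2sql-examples",         # hypothetical LangSmith dataset name
    evaluators=[correct_answer],
    experiment_prefix="ci-final-response",
)
```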
Prerequisites
Before setting up the CI/CD pipeline, ensure you have:
- An AI agent application (in this case, built using LangGraph).
- A LangSmith account.
- A LangSmith API key, needed to deploy agents and retrieve experiment results.
- Project-specific environment variables configured in your repository secrets (e.g., LLM model API keys, vector store credentials, database connections).
While this example uses GitHub, the CI/CD pipeline works with other Git hosting platforms such as GitLab and Bitbucket.
Deployment options
LangSmith supports multiple deployment methods, depending on how your LangSmith instance is hosted:
- Cloud LangSmith: Direct GitHub integration or Docker image deployment.
- Self-Hosted/Hybrid: Container registry-based deployments.
Both methods require a `langgraph.json` file and a dependency file (`requirements.txt` or `pyproject.toml`) in your project. Use the `langgraph dev` CLI tool to check for errors and fix any that appear; once the agent runs cleanly, deployment to LangSmith Deployments should succeed.
Prerequisites for manual deployment
Before deploying your agent, ensure you have:
- LangGraph graph: Your agent implementation (e.g., `./agents/simple_text2sql.py:agent`).
- Dependencies: Either `requirements.txt` or `pyproject.toml` with all required packages.
- Configuration: A `langgraph.json` file specifying:
  - Path to your agent graph
  - Dependencies location
  - Environment variables
  - Python version
A minimal `langgraph.json` for this layout might look like the following (the graph path and name are illustrative):
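```json
{
  "dependencies": ["."],
  "graphs": {
    "agent": "./agents/simple_text2sql.py:agent"
  },
  "env": ".env",
  "python_version": "3.11"
}
```
The `env` and `python_version` values here are placeholders; set them to match your project.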
Local development and testing

Running the `langgraph dev` CLI will:
- Spin up a local server with Studio.
- Allow you to visualize and interact with your graph.
- Validate that your agent works correctly before deployment.
If your agent runs locally without errors, deployment to LangSmith will likely succeed. This local testing helps catch configuration issues, dependency problems, and agent logic errors before you attempt deployment.
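For example, assuming the CLI is installed with its local in-memory extra:

```bash
pip install -U "langgraph-cli[inmem]"  # CLI plus local dev server dependencies
langgraph dev                          # start the local server with Studio access
```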
Method 1: LangSmith Deployment UI
Deploy your agent using the LangSmith deployment interface:
- Go to your LangSmith dashboard.
- Navigate to the Deployments section.
- Click the + New Deployment button in the top right.
- Point the deployment at your agent:
  - Cloud LangSmith: Select the GitHub repository containing your LangGraph agent from the dropdown menu (direct GitHub integration).
  - Self-Hosted/Hybrid LangSmith: Specify your image URI in the Image Path field (e.g., `docker.io/username/my-agent:latest`).
Benefits:
- Simple UI-based deployment
- Direct integration with your GitHub repository (cloud)
- No manual Docker image management required (cloud)
Method 2: Control Plane API
Build a Docker image and deploy using the Control Plane API (a sketch follows the list):
- Cloud LangSmith: Use the Control Plane API to create deployments from your container registry.
- Self-Hosted/Hybrid LangSmith: Use the Control Plane API to create deployments from your container registry.
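As a rough sketch, a deployment creation call might look like the following. The base URL comes from the endpoint table in Troubleshooting below, but the `/v2/deployments` route and the payload fields are assumptions; treat the linked helper script and the Control Plane API reference as the source of truth:

```python
import os

import requests

CONTROL_PLANE_URL = "https://api.host.langchain.com"  # US cloud endpoint
HEADERS = {"X-Api-Key": os.environ["LANGSMITH_API_KEY"]}

payload = {
    "name": "my-agent-preview",                              # illustrative name
    "source": "external_docker",                             # assumed field names
    "source_config": {
        "image_path": "docker.io/username/my-agent:latest",  # your registry image
    },
}

resp = requests.post(f"{CONTROL_PLANE_URL}/v2/deployments", headers=HEADERS, json=payload)
resp.raise_for_status()
deployment_id = resp.json()["id"]  # reuse this ID when creating revisions
```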
Connect to Your Deployed Agent
- LangGraph SDK: Use the LangGraph SDK for programmatic integration.
- RemoteGraph: Connect using RemoteGraph to use your deployed graph inside other graphs.
- REST API: Use HTTP-based interactions with your deployed agent.
- Studio: Access the visual interface for testing and debugging.
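A short sketch of the first two options (the deployment URL and the question are placeholders; the graph name `agent` matches the `langgraph.json` example above):

```python
from langgraph.pregel.remote import RemoteGraph
from langgraph_sdk import get_client

DEPLOYMENT_URL = "https://my-agent.example.langgraph.app"  # hypothetical URL

# Option 1: LangGraph SDK client for threads, runs, and assistants.
client = get_client(url=DEPLOYMENT_URL)  # api_key defaults to LANGSMITH_API_KEY

# Option 2: RemoteGraph lets you call the deployed graph like a local one,
# e.g., as a subgraph inside another graph.
remote = RemoteGraph("agent", url=DEPLOYMENT_URL)
result = remote.invoke({"question": "How many users signed up last week?"})
```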
Environment Configuration
Database & Cache Configuration
By default, LangSmith Deployments provisions PostgreSQL and Redis instances for you. To use external services instead, set the corresponding connection environment variables (such as `POSTGRES_URI` and `REDIS_URI`) in your new deployment or revision.
Troubleshooting
Wrong API Endpoints
If you’re experiencing connection issues, verify you’re using the correct endpoint format for your LangSmith instance. There are two different APIs with different endpoints:
LangSmith API (Traces, Ingestion, etc.)
For LangSmith API operations (traces, evaluations, datasets):

| Region | Endpoint |
| --- | --- |
| US | https://api.smith.langchain.com |
| EU | https://eu.api.smith.langchain.com |
| Self-Hosted | `http(s)://<langsmith-url>/api` |

where `<langsmith-url>` is your self-hosted instance URL.
If you’re setting the endpoint in the `LANGSMITH_ENDPOINT` environment variable, you need to add `/v1` at the end (e.g., https://api.smith.langchain.com/v1).
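For example:

```bash
export LANGSMITH_ENDPOINT="https://api.smith.langchain.com/v1"
```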
LangSmith Deployments API (Deployments)
For LangSmith Deployments operations (deployments, revisions):

| Region | Endpoint |
| --- | --- |
| US | https://api.host.langchain.com |
| EU | https://eu.api.host.langchain.com |
| Self-Hosted | `http(s)://<langsmith-url>/api-host` |

where `<langsmith-url>` is your self-hosted instance URL.