Reference architecture
We recommend using GCP's managed services to provide a scalable, secure, and resilient platform. The following architecture applies to both self-hosted and hybrid deployments and aligns with the Google Cloud Well-Architected Framework (a brief connectivity sketch follows the list):
- Ingress & networking: Requests enter via Cloud Load Balancing within your VPC, secured using Cloud Armor and IAM-based authentication.
- Frontend & backend services: Containers run on Google Kubernetes Engine (GKE) behind the load balancer and route requests to other services within the cluster as necessary.
- Storage & databases:
  - Cloud SQL for PostgreSQL: metadata, projects, users, and short-term and long-term memory for deployed agents. LangSmith supports PostgreSQL version 14 or higher.
  - Memorystore for Redis: caching and job queues. Memorystore can be in single-instance or cluster mode, running Redis OSS version 5 or higher.
  - ClickHouse + Persistent Disks: analytics and trace storage.
    - We recommend using an externally managed ClickHouse solution unless security or compliance reasons prevent you from doing so.
    - ClickHouse is not required for hybrid deployments.
  - Cloud Storage: object storage for trace artifacts and telemetry.
- LLM integration: Optionally proxy requests to Vertex AI for LLM inference.
- Monitoring & observability: Integrate with Cloud Monitoring and Cloud Logging.
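For illustration, a minimal connectivity check against this topology might look like the following, assuming the langsmith Python SDK and a hypothetical internal endpoint exposed through Cloud Load Balancing (URL and API key are placeholders):

```python
from langsmith import Client

# Placeholder endpoint behind Cloud Load Balancing; replace with the URL and
# API key of your self-hosted or hybrid LangSmith instance.
client = Client(
    api_url="https://langsmith.internal.example.com/api",
    api_key="YOUR_API_KEY",
)

# Smoke test: list a tracing project to confirm the instance is reachable.
for project in client.list_projects():
    print(project.name)
    break  # one project is enough for a connectivity check
```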
Compute options
LangSmith supports multiple compute options depending on your requirements:

| Compute option | Description | Suitable for |
|---|---|---|
| Google Kubernetes Engine (preferred) | Advanced scaling and multi-tenant support | Large enterprises |
| Compute Engine-based | Full control, BYO-infra | Regulated or air-gapped environments |
Google Cloud Well-Architected best practices
This reference architecture is designed to align with the six pillars of the Google Cloud Well-Architected Framework:
Operational excellence
- Automate deployments with IaC (Terraform / Deployment Manager).
- Use Secret Manager for configuration and sensitive data (see the sketch after this list).
- Configure your LangSmith instance to export telemetry data and continuously monitor via Cloud Logging.
- The preferred method to manage LangSmith deployments is to create a CI process that builds Agent Server images and pushes them to Artifact Registry. Create a test deployment for pull requests before deploying a new revision to staging or production upon PR merge.
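As one example, application code running on GKE can read configuration from Secret Manager instead of baking credentials into images. A minimal sketch using the google-cloud-secret-manager client; the project and secret IDs are placeholders:

```python
from google.cloud import secretmanager

def read_secret(project_id: str, secret_id: str, version: str = "latest") -> str:
    """Fetch a secret payload from Secret Manager (e.g. a LangSmith API key)."""
    client = secretmanager.SecretManagerServiceClient()
    name = f"projects/{project_id}/secrets/{secret_id}/versions/{version}"
    response = client.access_secret_version(request={"name": name})
    return response.payload.data.decode("utf-8")

# Placeholder names; substitute your own project and secret IDs.
api_key = read_secret("my-gcp-project", "langsmith-api-key")
```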
Security
- Use IAM roles with least-privilege policies and Workload Identity for secure pod-to-GCP-service authentication (see the sketch after this list).
- Enable encryption at rest (Cloud SQL, Cloud Storage, Persistent Disks) and in transit (TLS 1.2+).
- Integrate with Secret Manager for credentials.
- Use Identity Platform or Workload Identity Federation as an IDP in conjunction with LangSmith’s built-in authentication and authorization features to secure access to agents and their tools.
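With Workload Identity enabled, pods obtain GCP credentials through Application Default Credentials rather than exported key files. A minimal sketch; the bucket name is a placeholder:

```python
import google.auth
from google.cloud import storage

# Application Default Credentials resolve to the Google service account bound
# to the pod's Kubernetes service account when Workload Identity is configured;
# no key file is mounted into the pod.
credentials, project_id = google.auth.default()

# Any client built on ADC (here, Cloud Storage) authenticates the same way.
storage_client = storage.Client(credentials=credentials, project=project_id)
bucket = storage_client.bucket("example-langsmith-artifacts")  # placeholder bucket
print(f"Authenticated to project {project_id} via Workload Identity.")
```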
Reliability
- Replicate the LangSmith data plane across regions: deploy identical data planes to Kubernetes clusters in different regions, and deploy Cloud SQL and GKE services across multiple zones.
- Implement autoscaling for backend workers using the Horizontal Pod Autoscaler and Cluster Autoscaler (see the sketch after this list).
- Use Cloud DNS health checks and failover policies.
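For example, a Horizontal Pod Autoscaler for a backend worker Deployment could be created with the official Kubernetes Python client; the namespace and Deployment name below are hypothetical:

```python
from kubernetes import client, config

config.load_kube_config()  # use load_incluster_config() when running in-cluster

# Scale a hypothetical "langsmith-backend" Deployment on CPU utilization.
hpa = client.V2HorizontalPodAutoscaler(
    metadata=client.V1ObjectMeta(name="langsmith-backend-hpa", namespace="langsmith"),
    spec=client.V2HorizontalPodAutoscalerSpec(
        scale_target_ref=client.V2CrossVersionObjectReference(
            api_version="apps/v1", kind="Deployment", name="langsmith-backend"
        ),
        min_replicas=2,
        max_replicas=10,
        metrics=[
            client.V2MetricSpec(
                type="Resource",
                resource=client.V2ResourceMetricSource(
                    name="cpu",
                    target=client.V2MetricTarget(type="Utilization", average_utilization=70),
                ),
            )
        ],
    ),
)

client.AutoscalingV2Api().create_namespaced_horizontal_pod_autoscaler(
    namespace="langsmith", body=hpa
)
```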
Performance optimization
- Choose Compute Engine machine types that match your workload to optimize compute performance.
- Use Cloud Storage lifecycle policies for infrequently accessed trace data, moving to Nearline or Coldline storage classes.
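A lifecycle rule that moves aging trace artifacts to colder storage classes can be set with the google-cloud-storage client. A minimal sketch; the bucket name and age thresholds are placeholders:

```python
from google.cloud import storage

client = storage.Client()
bucket = client.get_bucket("example-langsmith-traces")  # placeholder bucket name

# Move objects to Nearline after 30 days and Coldline after 90 days.
bucket.add_lifecycle_set_storage_class_rule("NEARLINE", age=30)
bucket.add_lifecycle_set_storage_class_rule("COLDLINE", age=90)
bucket.patch()  # persist the updated lifecycle configuration
```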
Cost optimization
- Right-size GKE clusters, and lower compute costs with Committed Use Discounts and Sustained Use Discounts.
- Monitor cost KPIs using Cloud Billing dashboards and Cost Management tools.
Sustainability
- Minimize idle workloads with on-demand compute and autoscaling.
- Store telemetry in lower-cost storage tiers using Cloud Storage lifecycle policies.
- Enable auto-shutdown for non-prod environments using scheduled actions.
Security and compliance
LangSmith can be configured for:
- Private Service Connect-only access (no public internet exposure, besides the egress necessary for billing).
- Cloud KMS-based encryption keys for Cloud Storage, Cloud SQL, and Persistent Disks (see the sketch after this list).
- Audit logging to Cloud Logging and Cloud Audit Logs.
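For example, a customer-managed encryption key from Cloud KMS can be set as the default for a Cloud Storage bucket. A minimal sketch; the project, key ring, key, and bucket names are placeholders:

```python
from google.cloud import storage

# Fully qualified Cloud KMS key name (all segments are placeholders).
kms_key_name = (
    "projects/my-gcp-project/locations/us-central1/"
    "keyRings/langsmith-keyring/cryptoKeys/langsmith-storage-key"
)

client = storage.Client()
bucket = client.get_bucket("example-langsmith-artifacts")  # placeholder bucket
bucket.default_kms_key_name = kms_key_name
bucket.patch()  # new objects are now encrypted with the CMEK by default
```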
Monitoring and evals
Use LangSmith to:
- Capture traces from LLM apps running on Vertex AI (see the sketch at the end of this section).
- Evaluate model outputs via LangSmith datasets.
- Track latency, token usage, and success rates.
- Visualize metrics in Cloud Monitoring dashboards.
- Export telemetry via OpenTelemetry and Prometheus exporters.
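For illustration, tracing a function call into a self-hosted LangSmith instance might look like the following, assuming a recent langsmith Python SDK that reads the LANGSMITH_* environment variables; the endpoint URL and model call are placeholders:

```python
import os
from langsmith import traceable

# Point the SDK at the self-hosted instance (placeholder URL and key).
os.environ["LANGSMITH_ENDPOINT"] = "https://langsmith.internal.example.com/api"
os.environ["LANGSMITH_API_KEY"] = "YOUR_API_KEY"
os.environ["LANGSMITH_TRACING"] = "true"

@traceable(name="summarize")
def summarize(text: str) -> str:
    # Call your Vertex AI-hosted model here; a stub stands in for illustration.
    return text[:80]

# The call below is captured as a trace with latency and metadata attached.
print(summarize("LangSmith records this invocation for monitoring and evaluation."))
```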