# Documentation Source: https://docs.langchain.com/index


LangChain is the platform for agent engineering. AI teams at Replit, Clay, Rippling, Cloudflare, Workday, and more trust LangChain's products to engineer reliable agents.

Open source agent frameworks

Quickly get started building agents, with any model provider of your choice. Control every step of your custom agent with low-level orchestration, memory, and human-in-the-loop support. Build agents that can tackle complex, multi-step tasks.
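As a quick illustration (not part of this page), the sketch below shows what a minimal agent built with the open source frameworks might look like. It assumes `langgraph` and the relevant model integration package are installed and that provider credentials are set in your environment; the `get_weather` tool and the model string are placeholders, not prescribed names.

```python theme={null}
# A minimal sketch of building an agent with the open source frameworks.
# Assumptions: `langgraph` (plus the model integration you choose) is installed and
# provider credentials are available via environment variables; tool and model are placeholders.
from langgraph.prebuilt import create_react_agent

def get_weather(city: str) -> str:
    """Placeholder tool: return a canned weather report for a city."""
    return f"It's always sunny in {city}."

agent = create_react_agent(
    model="anthropic:claude-3-5-sonnet-latest",  # any "provider:model" string you have access to
    tools=[get_weather],
)

result = agent.invoke({"messages": [{"role": "user", "content": "What's the weather in SF?"}]})
print(result["messages"][-1].content)
```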

LangSmith

[**LangSmith**](/langsmith/home) is a platform that helps AI teams use live production data for continuous testing and improvement. LangSmith provides:

* See exactly how your agent thinks and acts with detailed tracing and aggregate trend metrics.
* Test and score agent behavior on production data or offline datasets to continuously improve performance.
* Iterate on prompts with version control, prompt optimization, and collaboration features.
* Ship your agent in one click, using scalable infrastructure built for long-running tasks.

LangSmith meets the highest standards of data security and privacy with HIPAA, SOC 2 Type 2, and GDPR compliance. For more information, see the [Trust Center](https://trust.langchain.com/).
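As a small, illustrative taste of the tracing capability described above, the sketch below records a single function call as a run in LangSmith. It assumes the `langsmith` package is installed and that `LANGSMITH_TRACING` and `LANGSMITH_API_KEY` are set in your environment; the function itself is a stand-in, not an API from this page.

```python theme={null}
# Minimal tracing sketch (illustrative only).
# Assumes: `pip install langsmith` and the LANGSMITH_TRACING / LANGSMITH_API_KEY
# environment variables are set so runs are sent to your LangSmith project.
from langsmith import traceable

@traceable(name="summarize")
def summarize(text: str) -> str:
    # Stand-in for a real model call; the traced inputs and outputs appear in LangSmith.
    return text[:100]

summarize("LangSmith records this call as a run you can inspect, score, and evaluate.")
```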

Get started

*** [Edit the source of this page on GitHub.](https://github.com/langchain-ai/docs/edit/main/src/index.mdx) [Connect these docs programmatically](/use-these-docs) to Claude, VSCode, and more via MCP for real-time answers. # Connect an authentication provider Source: https://docs.langchain.com/langsmith/add-auth-server In [the last tutorial](/langsmith/resource-auth), you added resource authorization to give users private conversations. However, you are still using hard-coded tokens for authentication, which is not secure. Now you'll replace those tokens with real user accounts using [OAuth2](/langsmith/deployment-quickstart). You'll keep the same [`Auth`](https://reference.langchain.com/python/langsmith/deployment/sdk/#langgraph_sdk.auth.Auth) object and [resource-level access control](/langsmith/auth#single-owner-resources), but upgrade authentication to use Supabase as your identity provider. While Supabase is used in this tutorial, the concepts apply to any OAuth2 provider. You'll learn how to: 1. Replace test tokens with real JWT tokens 2. Integrate with OAuth2 providers for secure user authentication 3. Handle user sessions and metadata while maintaining our existing authorization logic ## Background OAuth2 involves three main roles: 1. **Authorization server**: The identity provider (e.g., Supabase, Auth0, Google) that handles user authentication and issues tokens 2. **Application backend**: Your LangGraph application. This validates tokens and serves protected resources (conversation data) 3. **Client application**: The web or mobile app where users interact with your service A standard OAuth2 flow works something like this: ```mermaid theme={null} sequenceDiagram participant User participant Client participant AuthServer participant LangGraph Backend User->>Client: Initiate login User->>AuthServer: Enter credentials AuthServer->>Client: Send tokens Client->>LangGraph Backend: Request with token LangGraph Backend->>AuthServer: Validate token AuthServer->>LangGraph Backend: Token valid LangGraph Backend->>Client: Serve request (e.g., run agent or graph) ``` ## Prerequisites Before you start this tutorial, ensure you have: * The [bot from the second tutorial](/langsmith/resource-auth) running without errors. * A [Supabase project](https://supabase.com/dashboard) to use its authentication server. ## 1. Install dependencies Install the required dependencies. Start in your `custom-auth` directory and ensure you have the `langgraph-cli` installed: ```bash pip theme={null} cd custom-auth pip install -U "langgraph-cli[inmem]" ``` ```bash uv theme={null} cd custom-auth uv add langgraph-cli[inmem] ``` ## 2. Set up the authentication provider Next, fetch the URL of your auth server and the private key for authentication. Since you're using Supabase for this, you can do this in the Supabase dashboard: 1. In the left sidebar, click on t️⚙ Project Settings" and then click "API" 2. Copy your project URL and add it to your `.env` file ```shell theme={null} echo "SUPABASE_URL=your-project-url" >> .env ``` 3. Copy your service role secret key and add it to your `.env` file: ```shell theme={null} echo "SUPABASE_SERVICE_KEY=your-service-role-key" >> .env ``` 4. Copy your "anon public" key and note it down. This will be used later when you set up our client code. ```bash theme={null} SUPABASE_URL=your-project-url SUPABASE_SERVICE_KEY=your-service-role-key ``` ## 3. 
Implement token validation In the previous tutorials, you used the [`Auth`](https://reference.langchain.com/python/langsmith/deployment/sdk/#langgraph_sdk.auth.Auth) object to [validate hard-coded tokens](/langsmith/set-up-custom-auth) and [add resource ownership](/langsmith/resource-auth). Now you'll upgrade your authentication to validate real JWT tokens from Supabase. The main changes will all be in the [`@auth.authenticate`](https://reference.langchain.com/python/langsmith/deployment/sdk/#langgraph_sdk.auth.Auth.authenticate) decorated function: * Instead of checking against a hard-coded list of tokens, you'll make an HTTP request to Supabase to validate the token. * You'll extract real user information (ID, email) from the validated token. * The existing resource authorization logic remains unchanged. Update `src/security/auth.py` to implement this: ```python {highlight={8-9,20-30}} title="src/security/auth.py" theme={null} import os import httpx from langgraph_sdk import Auth auth = Auth() # This is loaded from the `.env` file you created above SUPABASE_URL = os.environ["SUPABASE_URL"] SUPABASE_SERVICE_KEY = os.environ["SUPABASE_SERVICE_KEY"] @auth.authenticate async def get_current_user(authorization: str | None): """Validate JWT tokens and extract user information.""" assert authorization scheme, token = authorization.split() assert scheme.lower() == "bearer" try: # Verify token with auth provider async with httpx.AsyncClient() as client: response = await client.get( f"{SUPABASE_URL}/auth/v1/user", headers={ "Authorization": authorization, "apiKey": SUPABASE_SERVICE_KEY, }, ) assert response.status_code == 200 user = response.json() return { "identity": user["id"], # Unique user identifier "email": user["email"], "is_authenticated": True, } except Exception as e: raise Auth.exceptions.HTTPException(status_code=401, detail=str(e)) # ... the rest is the same as before # Keep our resource authorization from the previous tutorial @auth.on async def add_owner(ctx, value): """Make resources private to their creator using resource metadata.""" filters = {"owner": ctx.user.identity} metadata = value.setdefault("metadata", {}) metadata.update(filters) return filters ``` The most important change is that we're now validating tokens with a real authentication server. Our authentication handler has the private key for our Supabase project, which we can use to validate the user's token and extract their information. ## 4. Test authentication flow Let's test out the new authentication flow. You can run the following code in a file or notebook. 
You will need to provide: * A valid email address * A Supabase project URL (from [above](#setup-auth-provider)) * A Supabase anon **public key** (also from [above](#setup-auth-provider)) ```python theme={null} import os import httpx from getpass import getpass from langgraph_sdk import get_client # Get email from command line email = getpass("Enter your email: ") base_email = email.split("@") password = "secure-password" # CHANGEME email1 = f"{base_email[0]}+1@{base_email[1]}" email2 = f"{base_email[0]}+2@{base_email[1]}" SUPABASE_URL = os.environ.get("SUPABASE_URL") if not SUPABASE_URL: SUPABASE_URL = getpass("Enter your Supabase project URL: ") # This is your PUBLIC anon key (which is safe to use client-side) # Do NOT mistake this for the secret service role key SUPABASE_ANON_KEY = os.environ.get("SUPABASE_ANON_KEY") if not SUPABASE_ANON_KEY: SUPABASE_ANON_KEY = getpass("Enter your public Supabase anon key: ") async def sign_up(email: str, password: str): """Create a new user account.""" async with httpx.AsyncClient() as client: response = await client.post( f"{SUPABASE_URL}/auth/v1/signup", json={"email": email, "password": password}, headers={"apiKey": SUPABASE_ANON_KEY}, ) assert response.status_code == 200 return response.json() # Create two test users print(f"Creating test users: {email1} and {email2}") await sign_up(email1, password) await sign_up(email2, password) ``` ⚠️ Before continuing: Check your email and click both confirmation links. Supabase will reject `/login` requests until after you have confirmed your users' email. Now test that users can only see their own data. Make sure the server is running (run `langgraph dev`) before proceeding. The following snippet requires the "anon public" key that you copied from the Supabase dashboard while [setting up the auth provider](#setup-auth-provider) previously. 
```python theme={null} async def login(email: str, password: str): """Get an access token for an existing user.""" async with httpx.AsyncClient() as client: response = await client.post( f"{SUPABASE_URL}/auth/v1/token?grant_type=password", json={ "email": email, "password": password }, headers={ "apikey": SUPABASE_ANON_KEY, "Content-Type": "application/json" }, ) assert response.status_code == 200 return response.json()["access_token"] # Log in as user 1 user1_token = await login(email1, password) user1_client = get_client( url="http://localhost:2024", headers={"Authorization": f"Bearer {user1_token}"} ) # Create a thread as user 1 thread = await user1_client.threads.create() print(f"✅ User 1 created thread: {thread['thread_id']}") # Try to access without a token unauthenticated_client = get_client(url="http://localhost:2024") try: await unauthenticated_client.threads.create() print("❌ Unauthenticated access should fail!") except Exception as e: print("✅ Unauthenticated access blocked:", e) # Try to access user 1's thread as user 2 user2_token = await login(email2, password) user2_client = get_client( url="http://localhost:2024", headers={"Authorization": f"Bearer {user2_token}"} ) try: await user2_client.threads.get(thread["thread_id"]) print("❌ User 2 shouldn't see User 1's thread!") except Exception as e: print("✅ User 2 blocked from User 1's thread:", e) ``` The output should look like this: ```shell theme={null} ✅ User 1 created thread: d6af3754-95df-4176-aa10-dbd8dca40f1a ✅ Unauthenticated access blocked: Client error '403 Forbidden' for url 'http://localhost:2024/threads' ✅ User 2 blocked from User 1's thread: Client error '404 Not Found' for url 'http://localhost:2024/threads/d6af3754-95df-4176-aa10-dbd8dca40f1a' ``` Your authentication and authorization are working together: 1. Users must log in to access the bot 2. Each user can only see their own threads All users are managed by the Supabase auth provider, so you don't need to implement any additional user management logic. ## Next steps You've successfully built a production-ready authentication system for your LangGraph application! Let's review what you've accomplished: 1. Set up an authentication provider (Supabase in this case) 2. Added real user accounts with email/password authentication 3. Integrated JWT token validation into your Agent Server 4. Implemented proper authorization to ensure users can only access their own data 5. Created a foundation that's ready to handle your next authentication challenge 🚀 Now that you have production authentication, consider: 1. Building a web UI with your preferred framework (see the [Custom Auth](https://github.com/langchain-ai/custom-auth) template for an example) 2. Learn more about the other aspects of authentication and authorization in the [conceptual guide on authentication](/langsmith/auth). 3. Customize your handlers and setup further after reading the [reference docs](https://reference.langchain.com/python/langsmith/deployment/sdk/#langgraph_sdk.auth.Auth). *** [Edit the source of this page on GitHub.](https://github.com/langchain-ai/docs/edit/main/src/langsmith/add-auth-server.mdx) [Connect these docs programmatically](/use-these-docs) to Claude, VSCode, and more via MCP for real-time answers. # Human-in-the-loop using server API Source: https://docs.langchain.com/langsmith/add-human-in-the-loop To review, edit, and approve tool calls in an agent or workflow, use LangGraph's [human-in-the-loop](/oss/python/langgraph/interrupts) features. 
## Dynamic interrupts ```python {highlight={2,34}} theme={null} from langgraph_sdk import get_client from langgraph_sdk.schema import Command client = get_client(url=) # Using the graph deployed with the name "agent" assistant_id = "agent" # create a thread thread = await client.threads.create() thread_id = thread["thread_id"] # Run the graph until the interrupt is hit. result = await client.runs.wait( thread_id, assistant_id, input={"some_text": "original text"} # (1)! ) print(result['__interrupt__']) # (2)! # > [ # > { # > 'value': {'text_to_revise': 'original text'}, # > 'resumable': True, # > 'ns': ['human_node:fc722478-2f21-0578-c572-d9fc4dd07c3b'], # > 'when': 'during' # > } # > ] # Resume the graph print(await client.runs.wait( thread_id, assistant_id, command=Command(resume="Edited text") # (3)! )) # > {'some_text': 'Edited text'} ``` 1. The graph is invoked with some initial state. 2. When the graph hits the interrupt, it returns an interrupt object with the payload and metadata. 3\. The graph is resumed with a `Command(resume=...)`, injecting the human's input and continuing execution. ```javascript {highlight={32}} theme={null} import { Client } from "@langchain/langgraph-sdk"; const client = new Client({ apiUrl: }); // Using the graph deployed with the name "agent" const assistantID = "agent"; // create a thread const thread = await client.threads.create(); const threadID = thread["thread_id"]; // Run the graph until the interrupt is hit. const result = await client.runs.wait( threadID, assistantID, { input: { "some_text": "original text" } } # (1)! ); console.log(result['__interrupt__']); # (2)! // > [ # > { # > 'value': {'text_to_revise': 'original text'}, # > 'resumable': True, # > 'ns': ['human_node:fc722478-2f21-0578-c572-d9fc4dd07c3b'], # > 'when': 'during' # > } # > ] // Resume the graph console.log(await client.runs.wait( threadID, assistantID, { command: { resume: "Edited text" }} # (3)! )); # > {'some_text': 'Edited text'} ``` 1. The graph is invoked with some initial state. 2. When the graph hits the interrupt, it returns an interrupt object with the payload and metadata. 3. The graph is resumed with a `{ resume: ... }` command object, injecting the human's input and continuing execution. Create a thread: ```bash theme={null} curl --request POST \ --url /threads \ --header 'Content-Type: application/json' \ --data '{}' ``` Run the graph until the interrupt is hit.: ```bash theme={null} curl --request POST \ --url /threads//runs/wait \ --header 'Content-Type: application/json' \ --data "{ \"assistant_id\": \"agent\", \"input\": {\"some_text\": \"original text\"} }" ``` Resume the graph: ```bash theme={null} curl --request POST \ --url /threads//runs/wait \ --header 'Content-Type: application/json' \ --data "{ \"assistant_id\": \"agent\", \"command\": { \"resume\": \"Edited text\" } }" ``` This is an example graph you can run in the Agent Server. See [LangSmith quickstart](/langsmith/deployment-quickstart) for more details. ```python {highlight={7,13}} theme={null} from typing import TypedDict import uuid from langgraph.checkpoint.memory import InMemorySaver from langgraph.constants import START from langgraph.graph import StateGraph from langgraph.types import interrupt, Command class State(TypedDict): some_text: str def human_node(state: State): value = interrupt( # (1)! { "text_to_revise": state["some_text"] # (2)! } ) return { "some_text": value # (3)! 
} # Build the graph graph_builder = StateGraph(State) graph_builder.add_node("human_node", human_node) graph_builder.add_edge(START, "human_node") graph = graph_builder.compile() ``` 1. `interrupt(...)` pauses execution at `human_node`, surfacing the given payload to a human. 2. Any JSON serializable value can be passed to the [`interrupt`](https://reference.langchain.com/python/langgraph/types/#langgraph.types.interrupt) function. Here, a dict containing the text to revise. 3. Once resumed, the return value of `interrupt(...)` is the human-provided input, which is used to update the state. Once you have a running Agent Server, you can interact with it using [LangGraph SDK](/langsmith/langgraph-python-sdk) ```python {highlight={2,34}} theme={null} from langgraph_sdk import get_client from langgraph_sdk.schema import Command client = get_client(url=) # Using the graph deployed with the name "agent" assistant_id = "agent" # create a thread thread = await client.threads.create() thread_id = thread["thread_id"] # Run the graph until the interrupt is hit. result = await client.runs.wait( thread_id, assistant_id, input={"some_text": "original text"} # (1)! ) print(result['__interrupt__']) # (2)! # > [ # > { # > 'value': {'text_to_revise': 'original text'}, # > 'resumable': True, # > 'ns': ['human_node:fc722478-2f21-0578-c572-d9fc4dd07c3b'], # > 'when': 'during' # > } # > ] # Resume the graph print(await client.runs.wait( thread_id, assistant_id, command=Command(resume="Edited text") # (3)! )) # > {'some_text': 'Edited text'} ``` 1. The graph is invoked with some initial state. 2. When the graph hits the interrupt, it returns an interrupt object with the payload and metadata. 3\. The graph is resumed with a `Command(resume=...)`, injecting the human's input and continuing execution. ```javascript {highlight={32}} theme={null} import { Client } from "@langchain/langgraph-sdk"; const client = new Client({ apiUrl: }); // Using the graph deployed with the name "agent" const assistantID = "agent"; // create a thread const thread = await client.threads.create(); const threadID = thread["thread_id"]; // Run the graph until the interrupt is hit. const result = await client.runs.wait( threadID, assistantID, { input: { "some_text": "original text" } } # (1)! ); console.log(result['__interrupt__']); # (2)! # > [ # > { # > 'value': {'text_to_revise': 'original text'}, # > 'resumable': True, # > 'ns': ['human_node:fc722478-2f21-0578-c572-d9fc4dd07c3b'], # > 'when': 'during' # > } # > ] // Resume the graph console.log(await client.runs.wait( threadID, assistantID, { command: { resume: "Edited text" }} # (3)! )); # > {'some_text': 'Edited text'} ``` 1. The graph is invoked with some initial state. 2. When the graph hits the interrupt, it returns an interrupt object with the payload and metadata. 3. The graph is resumed with a `{ resume: ... }` command object, injecting the human's input and continuing execution. 
Create a thread: ```bash theme={null} curl --request POST \ --url /threads \ --header 'Content-Type: application/json' \ --data '{}' ``` Run the graph until the interrupt is hit: ```bash theme={null} curl --request POST \ --url /threads//runs/wait \ --header 'Content-Type: application/json' \ --data "{ \"assistant_id\": \"agent\", \"input\": {\"some_text\": \"original text\"} }" ``` Resume the graph: ```bash theme={null} curl --request POST \ --url /threads//runs/wait \ --header 'Content-Type: application/json' \ --data "{ \"assistant_id\": \"agent\", \"command\": { \"resume\": \"Edited text\" } }" ``` ## Static interrupts Static interrupts (also known as static breakpoints) are triggered either before or after a node executes. Static interrupts are **not** recommended for human-in-the-loop workflows. They are best used for debugging and testing. You can set static interrupts by specifying `interrupt_before` and `interrupt_after` at compile time: ```python {highlight={1,2,3}} theme={null} graph = graph_builder.compile( # (1)! interrupt_before=["node_a"], # (2)! interrupt_after=["node_b", "node_c"], # (3)! ) ``` 1. The breakpoints are set during `compile` time. 2. `interrupt_before` specifies the nodes where execution should pause before the node is executed. 3. `interrupt_after` specifies the nodes where execution should pause after the node is executed. Alternatively, you can set static interrupts at run time: ```python {highlight={1,5,6}} theme={null} await client.runs.wait( # (1)! thread_id, assistant_id, inputs=inputs, interrupt_before=["node_a"], # (2)! interrupt_after=["node_b", "node_c"] # (3)! ) ``` 1. `client.runs.wait` is called with the `interrupt_before` and `interrupt_after` parameters. This is a run-time configuration and can be changed for every invocation. 2. `interrupt_before` specifies the nodes where execution should pause before the node is executed. 3. `interrupt_after` specifies the nodes where execution should pause after the node is executed. ```javascript {highlight={1,6,7}} theme={null} await client.runs.wait( // (1)! threadID, assistantID, { input: input, interruptBefore: ["node_a"], // (2)! interruptAfter: ["node_b", "node_c"] // (3)! } ) ``` 1. `client.runs.wait` is called with the `interruptBefore` and `interruptAfter` parameters. This is a run-time configuration and can be changed for every invocation. 2. `interruptBefore` specifies the nodes where execution should pause before the node is executed. 3. `interruptAfter` specifies the nodes where execution should pause after the node is executed. ```bash theme={null} curl --request POST \ --url /threads//runs/wait \ --header 'Content-Type: application/json' \ --data "{ \"assistant_id\": \"agent\", \"interrupt_before\": [\"node_a\"], \"interrupt_after\": [\"node_b\", \"node_c\"], \"input\": }" ``` The following example shows how to add static interrupts: ```python theme={null} from langgraph_sdk import get_client client = get_client(url=) # Using the graph deployed with the name "agent" assistant_id = "agent" # create a thread thread = await client.threads.create() thread_id = thread["thread_id"] # Run the graph until the breakpoint result = await client.runs.wait( thread_id, assistant_id, input=inputs # (1)! ) # Resume the graph await client.runs.wait( thread_id, assistant_id, input=None # (2)! ) ``` 1. The graph is run until the first breakpoint is hit. 2. The graph is resumed by passing in `None` for the input. This will run the graph until the next breakpoint is hit. 
```js theme={null} import { Client } from "@langchain/langgraph-sdk"; const client = new Client({ apiUrl: }); // Using the graph deployed with the name "agent" const assistantID = "agent"; // create a thread const thread = await client.threads.create(); const threadID = thread["thread_id"]; // Run the graph until the breakpoint const result = await client.runs.wait( threadID, assistantID, { input: input } # (1)! ); // Resume the graph await client.runs.wait( threadID, assistantID, { input: null } # (2)! ); ``` 1. The graph is run until the first breakpoint is hit. 2. The graph is resumed by passing in `null` for the input. This will run the graph until the next breakpoint is hit. Create a thread: ```bash theme={null} curl --request POST \ --url /threads \ --header 'Content-Type: application/json' \ --data '{}' ``` Run the graph until the breakpoint: ```bash theme={null} curl --request POST \ --url /threads//runs/wait \ --header 'Content-Type: application/json' \ --data "{ \"assistant_id\": \"agent\", \"input\": }" ``` Resume the graph: ```bash theme={null} curl --request POST \ --url /threads//runs/wait \ --header 'Content-Type: application/json' \ --data "{ \"assistant_id\": \"agent\" }" ``` ## Learn more * [Human-in-the-loop conceptual guide](/oss/python/langgraph/interrupts): learn more about LangGraph human-in-the-loop features. * [Common patterns](/oss/python/langgraph/interrupts#common-patterns): learn how to implement patterns like approving/rejecting actions, requesting user input, tool call review, and validating human input. *** [Edit the source of this page on GitHub.](https://github.com/langchain-ai/docs/edit/main/src/langsmith/add-human-in-the-loop.mdx) [Connect these docs programmatically](/use-these-docs) to Claude, VSCode, and more via MCP for real-time answers. # Add metadata and tags to traces Source: https://docs.langchain.com/langsmith/add-metadata-tags LangSmith supports sending arbitrary metadata and tags along with traces. Tags are strings that can be used to categorize or label a trace. Metadata is a dictionary of key-value pairs that can be used to store additional information about a trace. Both are useful for associating additional information with a trace, such as the environment in which it was executed, the user who initiated it, or an internal correlation ID. For more information on tags and metadata, see the [Concepts](/langsmith/observability-concepts#tags) page. For information on how to query traces and runs by metadata and tags, see the [Filter traces in the application](/langsmith/filter-traces-in-application) page. 
```python Python theme={null} import openai import langsmith as ls from langsmith.wrappers import wrap_openai client = openai.Client() messages = [ {"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "Hello!"} ] # You can set metadata & tags **statically** when decorating a function # Use the @traceable decorator with tags and metadata # Ensure that the LANGSMITH_TRACING environment variables are set for @traceable to work @ls.traceable( run_type="llm", name="OpenAI Call Decorator", tags=["my-tag"], metadata={"my-key": "my-value"} ) def call_openai( messages: list[dict], model: str = "gpt-4o-mini" ) -> str: # You can also dynamically set metadata on the parent run: rt = ls.get_current_run_tree() rt.metadata["some-conditional-key"] = "some-val" rt.tags.extend(["another-tag"]) return client.chat.completions.create( model=model, messages=messages, ).choices[0].message.content call_openai( messages, # To add at **invocation time**, when calling the function. # via the langsmith_extra parameter langsmith_extra={"tags": ["my-other-tag"], "metadata": {"my-other-key": "my-value"}} ) # Alternatively, you can use the context manager with ls.trace( name="OpenAI Call Trace", run_type="llm", inputs={"messages": messages}, tags=["my-tag"], metadata={"my-key": "my-value"}, ) as rt: chat_completion = client.chat.completions.create( model="gpt-4o-mini", messages=messages, ) rt.metadata["some-conditional-key"] = "some-val" rt.end(outputs={"output": chat_completion}) # You can use the same techniques with the wrapped client patched_client = wrap_openai( client, tracing_extra={"metadata": {"my-key": "my-value"}, "tags": ["a-tag"]} ) chat_completion = patched_client.chat.completions.create( model="gpt-4o-mini", messages=messages, langsmith_extra={ "tags": ["my-other-tag"], "metadata": {"my-other-key": "my-value"}, }, ) ``` ```typescript TypeScript theme={null} import OpenAI from "openai"; import { traceable, getCurrentRunTree } from "langsmith/traceable"; import { wrapOpenAI } from "langsmith/wrappers"; const client = wrapOpenAI(new OpenAI()); const messages: OpenAI.Chat.ChatCompletionMessageParam[] = [ { role: "system", content: "You are a helpful assistant." }, { role: "user", content: "Hello!" }, ]; const traceableCallOpenAI = traceable( async (messages: OpenAI.Chat.ChatCompletionMessageParam[]) => { const completion = await client.chat.completions.create({ model: "gpt-4o-mini", messages, }); const runTree = getCurrentRunTree(); runTree.extra.metadata = { ...runTree.extra.metadata, someKey: "someValue", }; runTree.tags = [...(runTree.tags ?? []), "runtime-tag"]; return completion.choices[0].message.content; }, { run_type: "llm", name: "OpenAI Call Traceable", tags: ["my-tag"], metadata: { "my-key": "my-value" }, } ); // Call the traceable function await traceableCallOpenAI(messages); ``` *** [Edit the source of this page on GitHub.](https://github.com/langchain-ai/docs/edit/main/src/langsmith/add-metadata-tags.mdx) [Connect these docs programmatically](/use-these-docs) to Claude, VSCode, and more via MCP for real-time answers. # Overview Source: https://docs.langchain.com/langsmith/administration-overview This overview covers topics related to managing users, organizations, and workspaces within LangSmith. ## Resource Hierarchy ### Organizations An organization is a logical grouping of users within LangSmith with its own billing configuration. Typically, there is one organization per company. An organization can have multiple workspaces. 
For more details, see the [setup guide](/langsmith/set-up-a-workspace#set-up-an-organization). When you log in for the first time, a personal organization will be created for you automatically. If you'd like to collaborate with others, you can create a separate organization and invite your team members to join. There are a few important differences between your personal organization and shared organizations: | Feature | Personal | Shared | | ------------------- | ------------------- | -------------------------------------------------------------------------------------------- | | Maximum workspaces | 1 | Variable, depending on plan (see [pricing page](https://www.langchain.com/pricing-langsmith) | | Collaboration | Cannot invite users | Can invite users | | Billing: paid plans | Developer plan only | All other plans available | ### Workspaces Workspaces were formerly called Tenants. Some code and APIs may still reference the old name for a period of time during the transition. A workspace is a logical grouping of users and resources within an organization. A workspace separates trust boundaries for resources and access control. Users may have permissions in a workspace that grant them access to the resources in that workspace, including tracing projects, datasets, annotation queues, and prompts. For more details, see the [setup guide](/langsmith/set-up-a-workspace). It is recommended to create a separate workspace for each team within your organization. To organize resources even further, you can use [Resource Tags](#resource-tags) to group resources within a workspace. The following image shows a sample workspace settings page: Sample Workspace The following diagram explains the relationship between organizations, workspaces, and the different resources scoped to and within a workspace: Resource Hierarchy See the table below for details on which features are available in which scope (organization or workspace): | Resource/Setting | Scope | | --------------------------------------------------------------------------- | ---------------- | | Trace Projects | Workspace | | Annotation Queues | Workspace | | Deployments | Workspace | | Datasets & Experiments | Workspace | | Prompts | Workspace | | Resource Tags | Workspace | | API Keys | Workspace | | Settings including Secrets, Feedback config, Models, Rules, and Shared URLs | Workspace | | User management: Invite User to Workspace | Workspace | | RBAC: Assigning Workspace Roles | Workspace | | Data Retention, Usage Limits | Workspace\* | | Plans and Billing, Credits, Invoices | Organization | | User management: Invite User to Organization | Organization\*\* | | Adding Workspaces | Organization | | Assigning Organization Roles | Organization | | RBAC: Creating/Editing/Deleting Custom Roles | Organization | \* Data retention settings and usage limits will be available soon for the organization level as well \*\* Self-hosted installations may enable workspace-level invites of users to the organization via a feature flag. See the [self-hosted user management docs](/langsmith/self-host-user-management) for details. ### Resource tags Resource tags allow you to organize resources within a workspace. Each tag is a key-value pair that can be assigned to a resource. Tags can be used to filter workspace-scoped resources in the UI and API: Projects, Datasets, Annotation Queues, Deployments, and Experiments. 
Each new workspace comes with two default tag keys: `Application` and `Environment`; as the names suggest, these tags can be used to categorize resources based on the application and environment they belong to. More tags can be added as needed. LangSmith resource tags are very similar to tags in cloud services like [AWS](https://docs.aws.amazon.com/tag-editor/latest/userguide/tagging.html). Sample Resource Tags ## User Management and RBAC ### Users A user is a person who has access to LangSmith. Users can be members of one or more organizations and workspaces within those organizations. Organization members are managed in organization settings: Sample Organization Members And workspace members are managed in workspace settings: Sample Workspace Members ### API keys We ended support for legacy API keys prefixed with `ls__` on October 22, 2024 in favor of personal access tokens (PATs) and service keys. We require using PATs and service keys for all new integrations. API keys prefixed with `ls__` will no longer work as of October 22, 2024. #### Expiration Dates When you create an API key, you have the option to set an expiration date. Adding an expiration date to keys enhances security and minimizes the risk of unauthorized access. For example, you may set expiration dates on keys for temporary tasks that require elevated access. By default, keys never expire. Once expired, an API key is no longer valid and cannot be reactivated or have its expiration modified. #### Personal Access Tokens (PATs) Personal Access Tokens (PATs) are used to authenticate requests to the LangSmith API. They are created by users and scoped to a user. The PAT will have the same permissions as the user that created it. We recommend not using these to authenticate requests from your application, but rather using them for personal scripts or tools that interact with the LangSmith API. If the user associated with the PAT is removed from the organization, the PAT will no longer work. PATs are prefixed with `lsv2_pt_` #### Service keys Service keys are similar to PATs, but are used to authenticate requests to the LangSmith API on behalf of a service account. Only admins can create service keys. We recommend using these for applications / services that need to interact with the LangSmith API, such as LangGraph agents or other integrations. Service keys may be scoped to a single workspace, multiple workspaces, or the entire organization, and can be used to authenticate requests to the LangSmith API for whichever workspace(s) it has access to. Service keys are prefixed with `lsv2_sk_` Use the `X-Tenant-Id` header to specify the target workspace. * **When using PATs**: If this header is omitted, requests will run against the default workspace associated with the key. * **When using organization-scoped service keys**: You must include the `X-Tenant-Id` header when accessing workspace-scoped resources. Without it, the request will fail with a `403 Forbidden` error. To see how to create a service key or Personal Access Token, see the [setup guide](/langsmith/create-account-api-key) ### Organization roles Organization roles are distinct from the [Enterprise feature workspace RBAC](#workspace-roles-rbac) and are used in the context of multiple [workspaces](#workspaces). Your organization role determines your workspace membership characteristics and your [organization-level permissions](/langsmith/organization-workspace-operations). 
The organization role selected also impacts workspace membership as described here: * [Organization Admin](/langsmith/rbac#organization-admin) grants full access to manage all organization configuration, users, billing, and workspaces. * An Organization Admin has `Admin` access to all workspaces in an organization. * [Organization User](/langsmith/rbac#organization-user) may read organization information but cannot execute any write actions at the organization level. An Organization User may create [Personal Access Tokens](#personal-access-tokens-pats). * An Organization User can be added to a subset of workspaces and assigned workspace roles as usual (if RBAC is enabled), which specify permissions at the workspace level. * [Organization Viewer](/langsmith/rbac#organization-viewer) is equivalent to Organization User, but **cannot** create Personal Access Tokens. (for self-hosted, available in Helm chart version 0.11.25+). The Organization User and Organization Viewer roles are only available in organizations on [plans](https://langchain.com/pricing) with multiple workspaces. In organizations limited to a single workspace, all users have the Organization Admin role. See [security settings](/langsmith/manage-organization-by-api#security-settings) for instructions on how to disable PAT creation for the entire organization. For more information on setting up organizations and workspaces, refer to the [organization setup guide](/langsmith/set-up-a-workspace#organization-roles) for more information. The following table provdies an overview of organization level permissions: | | Organization Viewer | Organization User | Organization Admin | | ------------------------------------------- | ------------------- | ----------------- | ------------------ | | View organization configuration | ✅ | ✅ | ✅ | | View organization roles | ✅ | ✅ | ✅ | | View organization members | ✅ | ✅ | ✅ | | View data retention settings | ✅ | ✅ | ✅ | | View usage limits | ✅ | ✅ | ✅ | | Create personal access tokens (PATs) | ❌ | ✅ | ✅ | | Admin access to all workspaces | ❌ | ❌ | ✅ | | Manage billing settings | ❌ | ❌ | ✅ | | Create workspaces | ❌ | ❌ | ✅ | | Create, edit, and delete organization roles | ❌ | ❌ | ✅ | | Invite new users to organization | ❌ | ❌ | ✅ | | Delete user invites | ❌ | ❌ | ✅ | | Remove users from an organization | ❌ | ❌ | ✅ | | Update data retention settings | ❌ | ❌ | ✅ | | Update usage limits | ❌ | ❌ | ✅ | For a comprehensive list of required permissions along with the operations and roles that can perform them, refer to the [Organization and workspace reference](/langsmith/organization-workspace-operations). ### Workspace roles (RBAC) RBAC (Role-Based Access Control) is a feature that is only available to Enterprise customers. If you are interested in this feature, [contact our sales team](https://www.langchain.com/contact-sales). Other plans default to using the Admin role for all users. Roles are used to define the set of permissions that a user has within a workspace. There are three built-in system roles that cannot be edited: * [Workspace Admin](/langsmith/rbac#workspace-admin) has full access to all resources within the workspace. * [Workspace Editor](/langsmith/rbac#workspace-editor) has full permissions except for workspace management (adding/removing users, changing roles, configuring service keys). * [Workspace Viewer](/langsmith/rbac#workspace-viewer) has read-only access to all resources within the workspace. 
[Organization admins](/langsmith/rbac#organization-admin) can also create/edit custom roles with specific permissions for different resources. Roles can be managed in **Organization Settings** under the **Roles** tab: The Organization members and roles view showing a list of the roles. * For comprehensive documentation on roles and permissions, refer to the [Role-based access control](/langsmith/rbac) guide. * For more details on assigning and creating roles, refer to the [User Management](/langsmith/user-management) guide. * For a comprehensive list of required permissions along with the operations and roles that can perform them, refer to the [Organization and workspace reference](/langsmith/organization-workspace-operations). ## Best Practices ### Environment Separation Use [resource tags](#resource-tags) to organize resources by environment using the default tag key `Environment` and different values for the environment (e.g., `dev`, `staging`, `prod`). We do not recommend using separate workspaces for environment separation because resources cannot be shared across workspaces, which would prevent you from promoting resources (like prompts) between environments. **Resource tags vs. commit tags for prompt management** While both types of tags can use environment terminology like `dev`, `staging`, and `prod`, they serve different purposes: * **Resource tags** (`Environment: prod`): Use these to *organize and filter* resources across your workspace. Apply resource tags to tracing projects, datasets, and other resources (including prompts) to group them by environment, which enables filtering in the UI. * [Commit tags](/langsmith/manage-prompts#commit-tags) (`prod` tag): Use these to manage which [prompt version](/langsmith/prompt-engineering) your code references. Commit tags are labels that point to specific commits in a prompt's history. When your code pulls a prompt by tag name (e.g., `client.pull_prompt("prompt-name:prod")`), it retrieves whichever commit that tag currently points to. To promote a prompt from `staging` to `prod`, move the commit tag to point to the desired version. Resource tags organize **which resources** belong to an environment. Commit tags let you control **which version** of a prompt your code references without changing the code itself. ## Usage and Billing ### Data Retention This section covers how data retention works and how it's priced in LangSmith. #### Why retention matters * **Privacy**: Many data privacy regulations, such as GDPR in Europe or CCPA in California, require organizations to delete personal data once it's no longer necessary for the purposes for which it was collected. Setting retention periods aids in compliance with such regulations. * **Cost**: LangSmith charges less for traces that have low data retention. See our tutorial on how to [optimize spend](/langsmith/billing#optimize-your-tracing-spend) for details. #### How it works LangSmith has two tiers of traces based on Data Retention with the following characteristics: | | Base | Extended | | -------------------- | ----------------- | --------------- | | **Price** | \$.50 / 1k traces | \$5 / 1k traces | | **Retention Period** | 14 days | 400 days | **Data deletion after retention ends** After the specified retention period, traces are no longer accessible in the tracing project UI or via the API. All user data associated with the trace (e.g. inputs and outputs) is deleted from our internal systems within a day thereafter. 
Some metadata associated with each trace may be retained indefinitely for analytics and billing purposes. **Data retention auto-upgrades** Auto upgrades can have an impact on your bill. Please read this section carefully to fully understand your estimated LangSmith tracing costs. When you use certain features with `base` tier traces, their data retention will be automatically upgraded to `extended` tier. This will increase both the retention period, and the cost of the trace. The complete list of scenarios in which a trace will upgrade when: * **Feedback** is added to any run on the trace (or any trace in the thread), whether through [manual annotation](/langsmith/annotate-traces-inline#annotate-traces-and-runs-inline), automatically with [an online evaluator](/langsmith/online-evaluations), or programmatically [via the SDK](/langsmith/attach-user-feedback#log-user-feedback-using-the-sdk). * An **[annotation queue](/langsmith/annotation-queues#assign-runs-to-an-annotation-queue)** receives any run from the trace. * An **[automation rule](/langsmith/rules#set-up-automation-rules)** matches any run within a trace. **Why auto-upgrade traces?** We have two reasons behind the auto-upgrade model for tracing: 1. We think that traces that match any of these conditions are fundamentally more interesting than other traces, and therefore it is good for users to be able to keep them around longer. 2. We philosophically want to charge customers an order of magnitude lower for traces that may not be interacted with meaningfully. We think auto-upgrades align our pricing model with the value that LangSmith brings, where only traces with meaningful interaction are charged at a higher rate. If you have questions or concerns about our pricing model, please feel free to reach out to [support@langchain.dev](mailto:support@langchain.dev) and let us know your thoughts! **How does data retention affect downstream features?** * **Annotation Queues, Run Rules, and Feedback**: Traces that use these features will be [auto-upgraded](#data-retention-auto-upgrades). * **Monitoring**: The monitoring tab will continue to work even after a base tier trace's data retention period ends. It is powered by trace metadata that exists for >30 days, meaning that your monitoring graphs will continue to stay accurate even on `base` tier traces. * **Datasets**: Datasets have an indefinite data retention period. Restated differently, if you add a trace's inputs and outputs to a dataset, they will never be deleted. We suggest that if you are using LangSmith for data collection, you take advantage of the datasets feature. #### Billing model **Billable metrics** On your LangSmith invoice, you will see two metrics that we charge for: * LangSmith Traces (Base Charge) * LangSmith Traces (Extended Data Retention Upgrades). The first metric includes all traces, regardless of tier. The second metric just counts the number of extended retention traces. **Why measure all traces + upgrades instead of base and extended traces?** A natural question to ask when considering our pricing is why not just show the number of `base` tier and `extended` tier traces directly on the invoice? While we understand this would be more straightforward, it doesn't fit trace upgrades properly. Consider a `base` tier trace that was recorded on June 30, and upgraded to `extended` tier on July 3. The `base` tier trace occurred in the June billing period, but the upgrade occurred in the July billing period. 
Therefore, we need to be able to measure these two events independently to properly bill our customers. If your trace was recorded as an extended retention trace, then the `base` and `extended` metrics will both be recorded with the same timestamp. **Cost breakdown** The Base Charge for a trace is .05¢ per trace. We priced the upgrade such that an `extended` retention trace costs 10x the price of a base tier trace (.50¢ per trace) including both metrics. Thus, each upgrade costs .45¢. ### Rate Limits LangSmith has rate limits which are designed to ensure the stability of the service for all users. To ensure access and stability, LangSmith will respond with HTTP Status Code 429 indicating that rate or usage limits have been exceeded under the following circumstances: #### Temporary throughput limit over a 1 minute period at our application load balancer This 429 is the the result of exceeding a fixed number of API calls over a 1 minute window on a per API key/access token basis. The start of the window will vary slightly — it is not guaranteed to start at the start of a clock minute — and may change depending on application deployment events. After the max events are received we will respond with a 429 until 60 seconds from the start of the evaluation window has been reached and then the process repeats. This 429 is thrown by our application load balancer and is a mechanism in place for all LangSmith users independent of plan tier to ensure continuity of service for all users. | Method | Endpoints | Limit | Window | | ----------------- | ------------- | ----- | -------- | | `DELETE` | `/sessions*` | 30 | 1 minute | | `POST` OR `PATCH` | `/runs*` | 5000 | 1 minute | | `GET` | `/runs/:id` | 30 | 1 minute | | `POST` | `/feedbacks*` | 5000 | 1 minute | | `*` | `*` | 2000 | 1 minute | The LangSmith SDK takes steps to minimize the likelihood of reaching these limits on run-related endpoints by batching up to 100 runs from a single session ID into a single API call. #### Plan-level hourly trace event limit This 429 is the result of reaching your maximum hourly events ingested and is evaluated in a fixed window starting at the beginning of each clock hour in UTC and resets at the top of each new hour. An event in this context is the creation or update of a run. So if run is created, then subsequently updated in the same hourly window, that will count as 2 events against this limit. This is thrown by our application and varies by plan tier, with organizations on our Startup/Plus and Enterprise plan tiers having higher hourly limits than our Free and Developer Plan Tiers which are designed for personal use. | Plan | Limit | Window | | -------------------------------- | -------------- | ------ | | Developer (no payment on file) | 50,000 events | 1 hour | | Developer (with payment on file) | 250,000 events | 1 hour | | Startup/Plus | 500,000 events | 1 hour | | Enterprise | Custom | Custom | #### Plan-level hourly trace data ingest limit This 429 is the result of reaching the maximum amount of data ingested across your trace inputs, outputs, and metadata and is evaluated in a fixed window starting at the beginning of each clock hour in UTC and resets at the top of each new hour. Typically, inputs, outputs, and metadata are send on both run creation and update events. So if a run is created and is 2.0MB in size at creation, and 3.0MB in size when updated in the same hourly window, that will count as 5.0MB of storage against this limit. 
This is thrown by our application and varies by plan tier, with organizations on our Startup/Plus and Enterprise plan tiers having higher hourly limits than our Free and Developer Plan Tiers which are designed for personal use. | Plan | Limit | Window | | -------------------------------- | ------ | ------ | | Developer (no payment on file) | 500MB | 1 hour | | Developer (with payment on file) | 2.5GB | 1 hour | | Startup/Plus | 5.0GB | 1 hour | | Enterprise | Custom | Custom | #### Plan-level monthly unique traces limit This 429 is the result of reaching your maximum monthly traces ingested and is evaluated in a fixed window starting at the beginning of each calendar month in UTC and resets at the beginning of each new month. This is thrown by our application and applies only to the Developer Plan Tier when there is no payment method on file. | Plan | Limit | Window | | ------------------------------ | ------------ | ------- | | Developer (no payment on file) | 5,000 traces | 1 month | #### Self-configured monthly usage limits This 429 is the result of reaching your usage limit as configured by your organization admin and is evaluated in a fixed window starting at the beginning of each calendar month in UTC and resets at the beginning of each new month. This is thrown by our application and varies by organization based on their configured settings. #### Handling 429s responses in your application Since some 429 responses are temporary and may succeed on a successive call, if you are directly calling the LangSmith API in your application we recommend implementing retry logic with exponential backoff and jitter. For convenience, LangChain applications built with the LangSmith SDK has this capability built-in. It is important to note that if you are saturating the endpoints for extended periods of time, retries may not be effective as your application will eventually run large enough backlogs to exhaust all retries. If that is the case, we would like to discuss your needs more specifically. Please reach out to [LangSmith Support](mailto:support@langchain.dev) with details about your applications throughput needs and sample code and we can work with you to better understand whether the best approach is fixing a bug, changes to your application code, or a different LangSmith plan. ### Usage Limits LangSmith lets you configure usage limits on tracing. Note that these are *usage* limits, not *spend* limits, which mean they let you limit the quantity of occurrences of some event rather than the total amount you will spend. LangSmith lets you set two different monthly limits, mirroring our Billable Metrics discussed in the aforementioned data retention guide: * All traces limit * Extended data retention traces limit These let you limit the number of total traces, and extended data retention traces respectively. #### Properties of usage limiting Usage limiting is approximate, meaning that we do not guarantee the exactness of the limit. In rare cases, there may be a small period of time where additional traces are processed above the limit threshold before usage limiting begins to apply. #### Side effects of extended data retention traces limit The extended data retention traces limit has side effects. If the limit is already reached, any feature that could cause an auto-upgrade of tracing tiers becomes inaccessible. This is because an auto-upgrade of a trace would cause another extended retention trace to be created, which in turn should not be allowed by the limit. Therefore, you can no longer: 1. 
match run rules 2. add feedback to traces 3. add runs to annotation queues Each of these features may cause an auto upgrade, so we shut them off when the limit is reached. #### Updating usage limits Usage limits can be updated from the `Settings` page under `Usage and Billing`. Limit values are cached, so it may take a minute or two before the new limits apply. ### Related content * Tutorial on how to [optimize spend](/langsmith/billing#optimize-your-tracing-spend) ## Additional Resources * **[Release Versions](/langsmith/release-versions)**: Learn about LangSmith's version support policy, including Active, Critical, End of Life, and Deprecated support levels. *** [Edit the source of this page on GitHub.](https://github.com/langchain-ai/docs/edit/main/src/langsmith/administration-overview.mdx) [Connect these docs programmatically](/use-these-docs) to Claude, VSCode, and more via MCP for real-time answers. # Set up Agent Auth (Beta) Source: https://docs.langchain.com/langsmith/agent-auth Enable secure access from agents to any system using OAuth 2.0 credentials with Agent Auth. Agent Auth is in **Beta** and under active development. To provide feedback or use this feature, reach out to the [LangChain team](https://forum.langchain.com/c/help/langsmith/). ## Installation Install the Agent Auth client library from PyPI: ```bash pip theme={null} pip install langchain-auth ``` ```bash uv theme={null} uv add langchain-auth ``` ## Quickstart ### 1. Initialize the client ```python theme={null} from langchain_auth import Client client = Client(api_key="your-langsmith-api-key") ``` ### 2. Set up OAuth providers Before agents can authenticate, you need to configure an OAuth provider using the following process: 1. Select a unique identifier for your OAuth provider to use in LangChain's platform (e.g., "github-local-dev", "google-workspace-prod"). 2. Go to your OAuth provider's developer console and create a new OAuth application. 3. Set LangChain's API as an available callback URL using this structure: ``` https://api.host.langchain.com/v2/auth/callback/{provider_id} ``` For example, if your provider\_id is "github-local-dev", use: ``` https://api.host.langchain.com/v2/auth/callback/github-local-dev ``` 4. Use `client.create_oauth_provider()` with the credentials from your OAuth app: ```python theme={null} new_provider = await client.create_oauth_provider( provider_id="{provider_id}", # Provide any unique ID. Not formally tied to the provider. name="{provider_display_name}", # Provide any display name client_id="{your_client_id}", client_secret="{your_client_secret}", auth_url="{auth_url_of_your_provider}", token_url="{token_url_of_your_provider}", ) ``` ### 3. Authenticate from an agent The client `authenticate()` API is used to get OAuth tokens from pre-configured providers. On the first call, it takes the caller through an OAuth 2.0 auth flow. #### In LangGraph context By default, tokens are scoped to the calling agent using the Assistant ID parameter. 
```python theme={null} auth_result = await client.authenticate( provider="{provider_id}", scopes=["scopeA"], user_id="your_user_id" # Any unique identifier to scope this token to the human caller ) # Or if you'd like a token that can be used by any agent, set agent_scoped=False auth_result = await client.authenticate( provider="{provider_id}", scopes=["scopeA"], user_id="your_user_id", agent_scoped=False ) ``` During execution, if authentication is required, the SDK will throw an [interrupt](https://langchain-ai.github.io/langgraph/how-tos/human_in_the_loop/add-human-in-the-loop/#pause-using-interrupt). The agent execution pauses and presents the OAuth URL to the user: Studio interrupt showing OAuth URL After the user completes OAuth authentication and we receive the callback from the provider, they will see the auth success page. GitHub OAuth success page The agent then resumes execution from the point it left off at, and the token can be used for any API calls. We store and refresh OAuth tokens so that future uses of the service by either the user or agent do not require an OAuth flow. ```python theme={null} token = auth_result.token ``` #### Outside LangGraph context Provide the `auth_url` to the user for out-of-band OAuth flows. ```python theme={null} # Default: user-scoped token (works for any agent under this user) auth_result = await client.authenticate( provider="{provider_id}", scopes=["scopeA"], user_id="your_user_id" ) if auth_result.needs_auth: print(f"Complete OAuth at: {auth_result.auth_url}") # Wait for completion completed_auth = await client.wait_for_completion(auth_result.auth_id) token = completed_auth.token else: token = auth_result.token ``` *** [Edit the source of this page on GitHub.](https://github.com/langchain-ai/docs/edit/main/src/langsmith/agent-auth.mdx) [Connect these docs programmatically](/use-these-docs) to Claude, VSCode, and more via MCP for real-time answers. # Agent Builder Source: https://docs.langchain.com/langsmith/agent-builder Agent Builder is in Beta. Agent Builder lets you turn natural-language ideas into production agents. It's powered by [deep-agents](https://github.com/langchain-ai/deepagents), and is not workflow based. ## Memory and updates Agent Builder includes persistent agent memory and supports self-updates. This lets agents adapt over time and refine how they work without manual edits. * Persistent memory: Agents retain relevant information across runs to inform future decisions. * What can be updated: Tools (add, remove, or reconfigure), and instructions/system prompts. * Agents cannot modify their name, description, and/or triggers attached. ## Triggers Triggers define when your agent should start running. You can connect your agent to external tools or time-based schedules, letting it respond automatically to messages, emails, or recurring events. The following examples show some of the apps you can use to trigger your agent: Activate your agent when messages are received in specific Slack channels. Trigger your agent when emails are received. Run your agent on a time-based schedule for recurring tasks. ## Sub-agents Agent Builder lets you create sub-agents within a main agent. Sub-agents are smaller, specialized agents that handle specific parts of a larger task. They can operate with their own tools, permissions, or goals while coordinating with the main agent. Using sub-agents makes it easier to build complex systems by dividing work into focused, reusable components. 
This modular approach helps keep your agents organized, scalable, and easier to maintain. Below are a few ways sub-agents can be used in your projects: * Handle distinct parts of a broader workflow (for example, data retrieval, summarization, or formatting). * Use different tools or context windows for specialized tasks. * Run independently but report results back to the main agent. ## Human in the loop Human-in-the-loop functionality allows you to review and approve agent actions before they execute, giving you control over critical decisions. ### Enabling interrupts When configuring your agent in Agent Builder, select the tool you want to add human oversight to. Look for the interrupt option when selecting the tool and toggle it on. The agent will pause and wait for human approval before executing that tool. ### Actions on interrupts When your agent reaches an interrupt point, you can take one of three actions: Approve the agent's proposed action and allow it to proceed as planned. Modify the agent's message or parameters before allowing it to continue. Provide feedback to the agent. *** [Edit the source of this page on GitHub.](https://github.com/langchain-ai/docs/edit/main/src/langsmith/agent-builder.mdx) [Connect these docs programmatically](/use-these-docs) to Claude, VSCode, and more via MCP for real-time answers. # LangSmith Tool Server Source: https://docs.langchain.com/langsmith/agent-builder-mcp-framework The LangSmith Tool Server is our MCP Framework that powers the tools available in the LangSmith Agent Builder. This framework enables you to build and deploy custom tools that can be integrated with your agents. It provides a standardized way to create, deploy, and manage tools with built-in authentication and authorization. The PyPi package that defines the framework is available [here](https://pypi.org/project/langsmith-tool-server/). ## Quick start Install the LangSmith Tool Server and LangChain CLI: ```bash theme={null} pip install langsmith-tool-server pip install langchain-cli-v2 ``` Create a new toolkit: ```bash theme={null} langchain tools new my-toolkit cd my-toolkit ``` This creates a toolkit with the following structure: ``` my-toolkit/ ├── pyproject.toml ├── toolkit.toml └── my_toolkit/ ├── __init__.py ├── auth.py └── tools/ ├── __init__.py └── ... ``` Define your tools using the `@tool` decorator: ```python theme={null} from langsmith_tool_server import tool @tool def hello(name: str) -> str: """Greet someone by name.""" return f"Hello, {name}!" @tool def add(x: int, y: int) -> int: """Add two numbers.""" return x + y TOOLS = [hello, add] ``` Run the server: ```bash theme={null} langchain tools serve ``` Your tool server will start on `http://localhost:8000`. 
## Simple client example

Here's a simple example that lists available tools and calls the `add` tool:

```python theme={null}
import asyncio
import aiohttp

async def mcp_request(url: str, method: str, params: dict = None):
    async with aiohttp.ClientSession() as session:
        payload = {"jsonrpc": "2.0", "method": method, "params": params or {}, "id": 1}
        async with session.post(f"{url}/mcp", json=payload) as response:
            return await response.json()

async def main():
    url = "http://localhost:8000"

    tools = await mcp_request(url, "tools/list")
    print(f"Tools: {tools}")

    result = await mcp_request(url, "tools/call", {"name": "add", "arguments": {"x": 5, "y": 3}})
    print(f"Result: {result}")

asyncio.run(main())
```

## Adding OAuth authentication

For tools that need to access third-party APIs (like Google, GitHub, Slack, etc.), you can use OAuth authentication with [Agent Auth](/langsmith/agent-auth).

Before using OAuth in your tools, you'll need to configure an OAuth provider in your LangSmith workspace settings. See the [Agent Auth documentation](/langsmith/agent-auth) for setup instructions.

Once configured, specify the `auth_provider` in your tool decorator:

```python theme={null}
from langsmith_tool_server import tool, Context
from google.oauth2.credentials import Credentials
from googleapiclient.discovery import build

@tool(
    auth_provider="google",
    scopes=["https://www.googleapis.com/auth/gmail.readonly"],
    integration="gmail"
)
async def read_emails(context: Context, max_results: int = 10) -> str:
    """Read recent emails from Gmail."""
    credentials = Credentials(token=context.token)
    service = build('gmail', 'v1', credentials=credentials)
    # ... Gmail API calls
    return f"Retrieved {max_results} emails"
```

Tools with `auth_provider` must:

* Have `context: Context` as the first parameter
* Specify at least one scope
* Use `context.token` to make authenticated API calls

## Using as an MCP gateway

The LangSmith Tool Server can act as an MCP gateway, aggregating tools from multiple MCP servers into a single endpoint. Configure MCP servers in your `toolkit.toml`:

```toml theme={null}
[toolkit]
name = "my-toolkit"
tools = "./my_toolkit/__init__.py:TOOLS"

[[mcp_servers]]
name = "weather"
transport = "streamable_http"
url = "http://localhost:8001/mcp/"

[[mcp_servers]]
name = "math"
transport = "stdio"
command = "python"
args = ["-m", "mcp_server_math"]
```

All tools from connected MCP servers are exposed through your server's `/mcp` endpoint. MCP tools are prefixed with their server name to avoid conflicts (e.g., `weather.get_forecast`, `math.add`).

## Custom authentication

Custom authentication allows you to validate requests and integrate with your identity provider. Define an authentication handler in your `auth.py` file:

```python theme={null}
from langsmith_tool_server import Auth

auth = Auth()

@auth.authenticate
async def authenticate(authorization: str = None) -> dict:
    """Validate requests and return user identity."""
    if not authorization or not authorization.startswith("Bearer "):
        raise auth.exceptions.HTTPException(
            status_code=401, detail="Unauthorized"
        )
    token = authorization.replace("Bearer ", "")

    # Validate token with your identity provider
    user = await verify_token_with_idp(token)
    return {"identity": user.id}
```

The handler runs on every request and must return a dict with `identity` (and optionally `permissions`).
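The `verify_token_with_idp` call above is a placeholder for your own verification logic. As a rough sketch (not part of the langsmith-tool-server API), assuming your identity provider issues RS256-signed JWTs and publishes a JWKS endpoint, the helper might look like this; the JWKS URL, audience, and `User` shape are illustrative:

```python theme={null}
# Hypothetical sketch of verify_token_with_idp using PyJWT (pip install "pyjwt[crypto]").
# The JWKS URL, audience, and User dataclass are assumptions, not a real API.
from dataclasses import dataclass
from typing import Optional

import jwt  # PyJWT

JWKS_URL = "https://your-idp.example.com/.well-known/jwks.json"
AUDIENCE = "my-toolkit"

# PyJWKClient caches fetched signing keys between calls.
_jwks_client = jwt.PyJWKClient(JWKS_URL)


@dataclass
class User:
    id: str
    email: Optional[str] = None


async def verify_token_with_idp(token: str) -> User:
    """Validate a bearer token against the identity provider and return the user."""
    # Note: the key lookup makes a blocking HTTP call on a cache miss; consider
    # wrapping it in asyncio.to_thread() for high-traffic servers.
    signing_key = _jwks_client.get_signing_key_from_jwt(token)
    claims = jwt.decode(
        token,
        signing_key.key,
        algorithms=["RS256"],
        audience=AUDIENCE,
    )
    # Map standard OIDC claims onto the identity returned by the auth handler.
    return User(id=claims["sub"], email=claims.get("email"))
```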
*** [Edit the source of this page on GitHub.](https://github.com/langchain-ai/docs/edit/main/src/langsmith/agent-builder-mcp-framework.mdx) [Connect these docs programmatically](/use-these-docs) to Claude, VSCode, and more via MCP for real-time answers.

# Agent Builder setup

Source: https://docs.langchain.com/langsmith/agent-builder-setup

Add required workspace secrets for models and tools used by Agent Builder.

This page lists the workspace secrets you need to add before using Agent Builder. Add these in your LangSmith workspace settings under Secrets. Keep values scoped to your workspace and avoid placing credentials in prompts or code.

## How to add workspace secrets

In the [LangSmith UI](https://smith.langchain.com), ensure that your Anthropic API key is set as a [workspace secret](/langsmith/administration-overview#workspace-secrets).

1. Navigate to **Settings** and open the **Secrets** tab.
2. Select **Add secret**, then enter `ANTHROPIC_API_KEY` as the key and your API key as the **Value**.
3. Select **Save secret**.

When adding workspace secrets in the LangSmith UI, make sure the secret keys match the environment variable names expected by your model provider.

## Required model key

* `ANTHROPIC_API_KEY`: Required for Agent Builder models. The agent graphs load this key from workspace secrets for inference.

## Optional tool keys

Add keys for any tools you enable. These are read from workspace secrets at runtime.

* `EXA_API_KEY`: Required for Exa search tools (general web and LinkedIn profile search).
* `TAVILY_API_KEY`: Required for Tavily web search.
* `TWITTER_API_KEY` and `TWITTER_API_KEY_SECRET`: Required for Twitter/X read operations (app-only bearer). Posting/media upload is not enabled.

*** [Edit the source of this page on GitHub.](https://github.com/langchain-ai/docs/edit/main/src/langsmith/agent-builder-setup.mdx) [Connect these docs programmatically](/use-these-docs) to Claude, VSCode, and more via MCP for real-time answers.

# LangSmith Agent Builder Slack App

Source: https://docs.langchain.com/langsmith/agent-builder-slack-app

Connect the LangSmith Agent Builder to your Slack workspace to power AI agents.

The LangSmith Agent Builder Slack app integrates your agents with Slack for secure, context-aware communication inside your Slack workspace. After installation, your agents will be able to:

* Send direct messages.
* Post to channels.
* Read thread messages.
* Reply in threads.
* Read conversation history.

## How to install

To install the LangSmith Agent Builder for Slack:

1. Navigate to Agent Builder in your [LangSmith workspace](https://smith.langchain.com).
2. Create or edit an agent.
3. Add Slack as a trigger or enable Slack tools.
4. When prompted, authorize the Slack connection.
5. Follow the OAuth flow to grant permissions to your Slack workspace.

The app will be installed automatically when you complete the authorization.

## Permissions

The LangSmith Agent Builder requires the following permissions in your Slack workspace:

* **Send messages** - Send direct messages and post to channels
* **Read messages** - Read channel history and thread messages
* **View channels** - Access basic channel information
* **View users** - Look up user information for messaging

These permissions enable agents to communicate effectively within your Slack workspace.

## Privacy policy

The LangSmith Agent Builder Slack app collects, manages, and stores third-party data in accordance with our privacy policy.
For full details on how your data is handled, please see [our privacy policy](https://www.langchain.com/privacy-policy). ## AI components and disclaimers The LangSmith Agent Builder uses large language models (LLMs) to power AI agents that interact with users in Slack. While these models are powerful, they have the potential to generate inaccurate responses, summaries, or other outputs. ### What you should know * **AI-generated content**: All responses from agents are generated by AI and may contain errors or inaccuracies. Always verify important information. * **Data usage**: Slack data is not used to train LLMs. Your workspace data remains private and is only used to provide agent functionality. * **Transparency**: The Agent Builder is transparent about the actions it will take once added to your workspace, as outlined in the permissions section above. ### Technical details The Agent Builder uses the following approach to AI: * **Model**: Uses LLMs provided through the LangSmith platform * **Data retention**: User data is retained according to LangSmith's data retention policies * **Data tenancy**: Data is handled according to your LangSmith organization settings * **Data residency**: Data residency follows your LangSmith configuration For more information about AI safety and best practices, see the [Agent Builder documentation](/langsmith/agent-builder). ## Pricing The LangSmith Agent Builder Slack app itself does not have any direct pricing. However, agent runs and traces are billed through the [LangSmith platform](https://smith.langchain.com) according to your organization's plan. For current pricing information, see the [LangSmith pricing page](https://www.langchain.com/pricing). *** [Edit the source of this page on GitHub.](https://github.com/langchain-ai/docs/edit/main/src/langsmith/agent-builder-slack-app.mdx) [Connect these docs programmatically](/use-these-docs) to Claude, VSCode, and more via MCP for real-time answers. # Supported tools Source: https://docs.langchain.com/langsmith/agent-builder-tools Use these built-in tools to give your agents access to email, calendars, chat, project management, search, social, and general web utilities. Google, Slack, Linear, and LinkedIn use OAuth. Exa, Tavily, and Twitter/X use workspace secrets. Read and send email
* Read emails (optionally include body, filter with search)
* Send email or reply to an existing message
* Create draft emails
* Mark messages as read
* Get a conversation thread
* Apply or create labels
* List mailbox labels

Send and read messages

* Send a direct message to a user
* Post a message to a channel
* Reply in a thread
* Read channel history
* Read thread messages

* Exa web search (optionally fetch page contents)
* Exa LinkedIn profile search
* Tavily web search

Post to profile

* Publish a post with optional image or link

Manage events

* List events for a date
* Get event details
* Create new events

Manage issues and teams

* List teams and team members
* List issues with filters
* Get issue details
* Create, update, or delete issues

* Read a tweet by ID
* Read recent posts from a list

* Read webpage text content
* Extract image URLs and metadata

* Notify user (for confirmations/updates)
*** [Edit the source of this page on GitHub.](https://github.com/langchain-ai/docs/edit/main/src/langsmith/agent-builder-tools.mdx) [Connect these docs programmatically](/use-these-docs) to Claude, VSCode, and more via MCP for real-time answers. # Agent Server Source: https://docs.langchain.com/langsmith/agent-server LangSmith Deployment's **Agent Server** offers an API for creating and managing agent-based applications. It is built on the concept of [assistants](/langsmith/assistants), which are agents configured for specific tasks, and includes built-in [persistence](/oss/python/langgraph/persistence#memory-store) and a **task queue**. This versatile API supports a wide range of agentic application use cases, from background processing to real-time interactions. Use Agent Server to create and manage [assistants](/langsmith/assistants), [threads](/oss/python/langgraph/persistence#threads), [runs](/langsmith/assistants#execution), [cron jobs](/langsmith/cron-jobs), [webhooks](/langsmith/use-webhooks), and more. **API reference**
For detailed information on the API endpoints and data models, refer to the [API reference docs](https://langchain-ai.github.io/langgraph/cloud/reference/api/api_ref.html).
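As a quick orientation, the following is a minimal sketch of driving an Agent Server deployment with the LangGraph Python SDK: create an assistant from a deployed graph, start a thread, and stream a run. The deployment URL, the `agent` graph name, and the assistant name are placeholders for your own deployment, and the exact SDK surface may differ slightly between versions:

```python theme={null}
# Minimal sketch using the LangGraph Python SDK (pip install langgraph-sdk).
# The URL and the "agent" graph name are placeholders for your deployment.
import asyncio

from langgraph_sdk import get_client


async def main():
    client = get_client(url="<DEPLOYMENT_URL>")

    # An assistant is a graph plus configuration; "agent" must match a graph
    # declared in your deployment's configuration.
    assistant = await client.assistants.create(
        graph_id="agent",
        name="support-assistant",
    )

    # Threads hold persisted conversation state across runs.
    thread = await client.threads.create()

    # Stream a run on the thread and print events as they arrive.
    async for chunk in client.runs.stream(
        thread["thread_id"],
        assistant["assistant_id"],
        input={"messages": [{"role": "user", "content": "Hello!"}]},
        stream_mode="values",
    ):
        print(chunk.event, chunk.data)


asyncio.run(main())
```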
To use the Enterprise version of the Agent Server, you must acquire a license key that you will need to specify when running the Docker image. To acquire a license key, [contact our sales team](https://www.langchain.com/contact-sales). You can run the Enterprise version of the Agent Server on the following LangSmith [platform](/langsmith/platform-setup) options: * [Cloud](/langsmith/cloud) * [Hybrid](/langsmith/hybrid) * [Self-hosted](/langsmith/self-hosted) ## Application structure To deploy an Agent Server application, you need to specify the graph(s) you want to deploy, as well as any relevant configuration settings, such as dependencies and environment variables. Read the [application structure](/langsmith/application-structure) guide to learn how to structure your LangGraph application for deployment. ## Parts of a deployment When you deploy Agent Server, you are deploying one or more [graphs](#graphs), a database for [persistence](/oss/python/langgraph/persistence), and a task queue. ### Graphs When you deploy a graph with Agent Server, you are deploying a "blueprint" for an [Assistant](/langsmith/assistants). An [Assistant](/langsmith/assistants) is a graph paired with specific configuration settings. You can create multiple assistants per graph, each with unique settings to accommodate different use cases that can be served by the same graph. Upon deployment, Agent Server will automatically create a default assistant for each graph using the graph's default configuration settings. We often think of a graph as implementing an [agent](/oss/python/langgraph/workflows-agents), but a graph does not necessarily need to implement an agent. For example, a graph could implement a simple chatbot that only supports back-and-forth conversation, without the ability to influence any application control flow. In reality, as applications get more complex, a graph will often implement a more complex flow that may use [multiple agents](/oss/python/langchain/multi-agent) working in tandem. ### Persistence and task queue Agent Server leverages a database for [persistence](/oss/python/langgraph/persistence) and a task queue. [PostgreSQL](https://www.postgresql.org/) is supported as a database for Agent Server and [Redis](https://redis.io/) as the task queue. If you're deploying using [LangSmith cloud](/langsmith/cloud), these components are managed for you. If you're deploying Agent Server on your [own infrastructure](/langsmith/self-hosted), you'll need to set up and manage these components yourself. For more information on how these components are set up and managed, review the [hosting options](/langsmith/platform-setup) guide. ## Learn more * [Application Structure](/langsmith/application-structure) guide explains how to structure your application for deployment. * The [API Reference](https://langchain-ai.github.io/langgraph/cloud/reference/api/api_ref.html) provides detailed information on the API endpoints and data models. *** [Edit the source of this page on GitHub.](https://github.com/langchain-ai/docs/edit/main/src/langsmith/agent-server.mdx) [Connect these docs programmatically](/use-these-docs) to Claude, VSCode, and more via MCP for real-time answers. # Agent Server Source: https://docs.langchain.com/langsmith/agent-server-api-ref *** [Edit the source of this page on GitHub.](https://github.com/langchain-ai/docs/edit/main/src/langsmith/agent-server-api-ref.mdx) [Connect these docs programmatically](/use-these-docs) to Claude, VSCode, and more via MCP for real-time answers. 
# Agent Server changelog Source: https://docs.langchain.com/langsmith/agent-server-changelog [Agent Server](/langsmith/agent-server) is an API platform for creating and managing agent-based applications. It provides built-in persistence, a task queue, and supports deploying, configuring, and running assistants (agentic workflows) at scale. This changelog documents all notable updates, features, and fixes to Agent Server releases.
## v0.5.24 * Added executor metrics for Datadog and enhanced core stream API metrics for better performance tracking. * Disabled Redis Go maintenance notifications to prevent startup errors with unsupported commands in Redis versions below 8. ## v0.5.20 * Resolved an error in the executor service that occurred when handling large messages. ## v0.5.19 * Upgraded built-in langchain-core to version 1.0.7 to address a prompt formatting vulnerability. ## v0.5.18 * Introduced persistent cron threads with `on_run_completed: {keep,delete}` for enhanced cron management and retrieval options. ## v0.5.17 * Enhanced task handling to support multiple interrupts, aligning with open-source functionality. ## v0.5.15 * Added custom JSON unmarshalling for `Resume` and `Goto` commands to fix map-style null resume interpretation issues. ## v0.5.14 * Ensured `pg make start` command functions correctly with core-api enabled. ## v0.5.13 * Support `include` and `exclude` (plural form key for `includes` and `excludes`) since a doc incorrectly claimed support for that. Now the server accepts either. ## v0.5.11 * Ensured auth handlers are applied consistently when streaming threads, aligning with recent security practices. * Bumped `undici` dependency from version 6.21.3 to 7.16.0, introducing various performance improvements and bug fixes. * Updated `p-queue` from version 8.0.1 to 9.0.0, introducing new features and breaking changes, including the removal of the `throwOnTimeout` option. ## v0.5.10 * Implemented healthcheck calls in the queue /ok handler to improve Kubernetes liveness and readiness probe compatibility. ## v0.5.9 * Resolved an issue causing an "unbound local error" for the `elapsed` variable during a SIGINT interruption. * Mapped the "interrupted" status to A2A's "input-required" status for better task status alignment. ## v0.5.8 * Ensured environment variables are passed as a dictionary when starting langgraph-ui for compatibility with `uvloop`. * Implemented CRUD operations for runs in Go, simplifying JSON merges and improving transaction readability, with PostgreSQL as a reference. ## v0.5.7 * Replaced no-retry Redis client with a retry client to handle connection errors more effectively and reduced corresponding logging severity. ## v0.5.6 * Added pending time metrics to provide better insights into task waiting times. * Replaced `pb.Value` with `ChannelValue` to streamline code structure. ## v0.5.5 * Made the Redis `health_check_interval` more frequent and configurable for better handling of idle connections. ## v0.5.4 * Implemented `ormsgpack` with `OPT_REPLACE_SURROGATES` and updated for compatibility with the latest FastAPI release affecting custom authentication dependencies. ## v0.5.2 * Added retry logic for PostgreSQL connections during startup to enhance deployment reliability and improved error logging for easier debugging. ## v0.5.1 * Resolved an issue where persistence was not functioning correctly with LangChain.js's createAgent feature. * Optimized assistants CRUD performance by improving database connection pooling and gRPC client reuse, reducing latency for large payloads. ## v0.5.0 * Updated dependency requirements to support the latest security patch, removed JSON fallback for serialization, and adjusted deserialization behavior for enhanced security. ## v0.4.47 * Validated and auto-corrected environment configuration types using TypeAdapter. * Added support for LangChain.js and LangGraph.js version 1.x, ensuring compatibility. 
* Updated hono library from version 4.9.7 to 4.10.3, addressing a CORS middleware security issue and enhancing JWT audience validation. * Introduced a modular benchmark framework, adding support for assistants and streams, with improvements to the existing ramp benchmark methodology. * Introduced a gRPC API for core threads CRUD operations, with updated Python and TypeScript clients. * Updated `hono` package from version 4.9.7 to 4.10.2, including security improvements for JWT audience validation. * Updated `hono` dependency from version 4.9.7 to 4.10.3 to fix a security issue and improve CORS middleware handling. * Introduced basic CRUD operations for threads, including create, get, patch, delete, search, count, and copy, with support for Go, gRPC server, and Python and TypeScript clients. ## v0.4.46 * Added an option to enable message streaming from subgraph events, giving users more control over event notifications. ## v0.4.45 * Implemented support for authorization on custom routes, controlled by the `enable_custom_route_auth` flag. * Set default tracing to off for improved performance and simplified debugging. ## v0.4.44 * Used Redis key prefix for license-related keys to prevent conflicts with existing setups. ## v0.4.43 * Implemented a health check for Redis connections to prevent them from idling out. ## v0.4.40 * Prevented duplicate messages in resumable run and thread streams by addressing a race condition and adding tests to ensure consistent behavior. * Ensured that runs don't start until the pubsub subscription is confirmed to prevent message drops on startup. * Renamed platform from langgraph to improve clarity and branding. * Reset PostgreSQL connections after use to prevent lock holding and improved error reporting for transaction issues. ## v0.4.39 * Upgraded `hono` from version 4.7.6 to 4.9.7, addressing a security issue related to the `bodyLimit` middleware. * Allowed customization of the base authentication URL to enhance flexibility. * Pinned the 'ty' dependency to a stable version using 'uv' to prevent unexpected linting failures. ## v0.4.38 * Replaced `LANGSMITH_API_KEY` with `LANGSMITH_CONTROL_PLANE_API_KEY` to support hybrid deployments requiring license verification. * Introduced self-hosted log ingestion support, configurable via `SELF_HOSTED_LOGS_ENABLED` and `SELF_HOSTED_LOGS_ENDPOINT` environment variables. ## v0.4.37 * Required create permissions for copying threads to ensure proper authorization. ## v0.4.36 * Improved error handling and added a delay to the sweep loop for smoother operation during Redis downtime or cancellation errors. * Updated the queue entrypoint to start the core-api gRPC server when `FF_USE_CORE_API` is enabled. * Introduced checks for invalid configurations in assistant endpoints to ensure consistency with other endpoints. ## v0.4.35 * Resolved a timezone issue in the core API, ensuring accurate time data retrieval. * Introduced a new `middleware_order` setting to apply authentication middleware before custom middleware, allowing finer control over protected route configurations. * Logged the Redis URL when errors occur during Redis client creation. * Improved Go engine/runtime context propagation to ensure consistent execution flow. * Removed the unnecessary `assistants.put` call from the executor entrypoint to streamline the process. ## v0.4.34 * Blocked unauthorized users from updating thread TTL settings to enhance security. 
## v0.4.33 * Improved error handling for Redis locks by logging `LockNotOwnedError` and extending initial pool migration lock timeout to 60 seconds. * Updated the BaseMessage schema to align with the latest langchain-core version and synchronized build dependencies for consistent local development. ## v0.4.32 * Added a GO persistence layer to the API image, enabling GRPC server operation with PostgreSQL support and enhancing configurability. * Set the status to error when a timeout occurs to improve error handling. ## v0.4.30 * Added support for context when using `stream_mode="events"` and included new tests for this functionality. * Added support for overriding the server port using `$LANGGRAPH_SERVER_PORT` and removed an unnecessary Dockerfile `ARG` for cleaner configuration. * Applied authorization filters to all table references in thread delete CTE to enhance security. * Introduced self-hosted metrics ingestion capability, allowing metrics to be sent to an OTLP collector every minute when the corresponding environment variables are set. * Ensured that the `set_latest` function properly updates the name and description of the version. ## v0.4.29 * Ensured proper cleanup of redis pubsub connections in all scenarios. ## v0.4.28 * Added a format parameter to the queue metrics server for enhanced customization. * Corrected `MOUNT_PREFIX` environment variable usage in CLI for consistency with documentation and to prevent confusion. * Added a feature to log warnings when messages are dropped due to no subscribers, controllable via a feature flag. * Added support for Bookworm and Bullseye distributions in Node images. * Consolidated executor definitions by moving them from the `langgraph-go` repository, improving manageability and updating the checkpointer setup method for server migrations. * Ensured correct response headers are sent for a2a, improving compatibility and communication. * Consolidated PostgreSQL checkpoint implementation, added CI testing for the `/core` directory, fixed RemoteStore test errors, and enhanced the Store implementation with transactions. * Added PostgreSQL migrations to the queue server to prevent errors from graphs being added before migrations are performed. ## v0.4.27 * Replaced `coredis` with `redis-py` to improve connection handling and reliability under high traffic loads. ## v0.4.24 * Added functionality to return full message history for A2A calls in accordance with the A2A spec. * Added a `LANGGRAPH_SERVER_HOST` environment variable to Dockerfiles to support custom host settings for dual stack mode. ## v0.4.23 * Use a faster message codec for redis streaming. ## v0.4.22 * Ported long-stream handling to the run stream, join, and cancel endpoints for improved stream management. ## v0.4.21 * Added A2A streaming functionality and enhanced testing with the A2A SDK. * Added Prometheus metrics to track language usage in graphs, middleware, and authentication for improved insights. * Fixed bugs in Open Source Software related to message conversion for chunks. * Removed await from pubsub subscribes to reduce flakiness in cluster tests and added retries in the shutdown suite to enhance API stability. ## v0.4.20 * Optimized Pubsub initialization to prevent overhead and address subscription timing issues, ensuring smoother run execution. ## v0.4.19 * Removed warnings from psycopg by addressing function checks introduced in version 3.2.10. ## v0.4.17 * Filtered out logs with mount prefix to reduce noise in logging output. 
## v0.4.16 * Added support for implicit thread creation in a2a to streamline operations. * Improved error serialization and emission in distributed runtime streams, enabling more comprehensive testing. ## v0.4.13 * Monitored queue status in the health endpoint to ensure correct behavior when PostgreSQL fails to initialize. * Addressed an issue with unequal swept ID lengths to improve log clarity. * Enhanced streaming outputs by avoiding re-serialization of DR payloads, using msgpack byte inspection for json-like parsing. ## v0.4.12 * Ensured metrics are returned even when experiencing database connection issues. * Optimized update streams to prevent unnecessary data transmission. * Upgraded `hono` from version 4.9.2 to 4.9.6 in the `storage_postgres/langgraph-api-server` for improved URL path parsing security. * Added retries and an in-memory cache for LangSmith access calls to improve resilience against single failures. ## v0.4.11 * Added support for TTL (time-to-live) in thread updates. ## v0.4.10 * In distributed runtime, update serde logic for final checkpoint -> thread setting. ## v0.4.9 * Added support for filtering search results by IDs in the search endpoint for more precise queries. * Included configurable headers for assistant endpoints to enhance request customization. * Implemented a simple A2A endpoint with support for agent card retrieval, task creation, and task management. ## v0.4.7 * Stopped the inclusion of x-api-key to enhance security. ## v0.4.6 * Fixed a race condition when joining streams, preventing duplicate start events. ## v0.4.5 * Ensured the checkpointer starts and stops correctly before and after the queue to improve shutdown and startup efficiency. * Resolved an issue where workers were being prematurely cancelled when the queue was cancelled. * Prevented queue termination by adding a fallback for cases when Redis fails to wake a worker. ## v0.4.4 * Set the custom auth thread\_id to None for stateless runs to prevent conflicts. * Improved Redis signaling in the Go runtime by adding a wakeup worker and Redis lock implementation, and updated sweep logic. ## v0.4.3 * Added stream mode to thread stream for improved data processing. * Added a durability parameter to runs for improved data persistence. ## v0.4.2 * Ensured pubsub is initialized before creating a run to prevent errors from missing messages. ## v0.4.0 * Emitted attempt messages correctly within the thread stream. * Reduced cluster conflicts by using only the thread ID for hashing in cluster mapping, prioritizing efficiency with stream\_thread\_cache. * Introduced a stream endpoint for threads to track all outputs across sequentially executed runs. * Made the filter query builder in PostgreSQL more robust against malformed expressions and improved validation to prevent potential security risks. ## v0.3.4 * Added custom Prometheus metrics for Redis/PG connection pools and switched the queue server to Uvicorn/Starlette for improved monitoring. * Restored Wolfi image build by correcting shell command formatting and added a Makefile target for testing with nginx. ## v0.3.3 * Added timeouts to specific Redis calls to prevent workers from being left active. * Updated the Golang runtime and added pytest skips for unsupported functionalities, including initial support for passing store to node and message streaming. * Introduced a reverse proxy setup for serving combined Python and Node.js graphs, with nginx handling server routing, to facilitate a Postgres/Redis backend for the Node.js API server. 
## v0.3.1 * Added a statement timeout to the pool to prevent long-running queries. ## v0.3.0 * Set a default 15-minute statement timeout and implemented monitoring for long-running queries to ensure system efficiency. * Stop propagating run configurable values to the thread configuration, because this can cause issues on subsequent runs if you are specifying a checkpoint\_id. This is a **slight breaking change** in behavior, since the thread value will no longer automatically reflect the unioned configuration of the most recent run. We believe this behavior is more intuitive, however. * Enhanced compatibility with older worker versions by handling event data in channel names within ops.py. ## v0.2.137 * Fixed an unbound local error and improved logging for thread interruptions or errors, along with type updates. ## v0.2.136 * Added enhanced logging to aid in debugging metaview issues. * Upgraded executor and runtime to the latest version for improved performance and stability. ## v0.2.135 * Ensured async coroutines are properly awaited to prevent potential runtime errors. ## v0.2.134 * Enhanced search functionality to improve performance by allowing users to select specific columns for query results. ## v0.2.133 * Added count endpoints for crons, threads, and assistants to enhance data tracking (#1132). * Improved SSH functionality for better reliability and stability. * Updated @langchain/langgraph-api to version 0.0.59 to fix an invalid state schema issue. ## v0.2.132 * Added Go language images to enhance project compatibility and functionality. * Printed internal PIDs for JS workers to facilitate process inspection via SIGUSR1 signal. * Resolved a `run_pkey` error that occurred when attempting to insert duplicate runs. * Added `ty run` command and switched to using uuid7 for generating run IDs. * Implemented the initial Golang runtime to expand language support. ## v0.2.131 * Added support for `object agent spec` with descriptions in JS. ## v0.2.130 * Added a feature flag (FF\_RICH\_THREADS=false) to disable thread updates on run creation, reducing lock contention and simplifying thread status handling. * Utilized existing connections for `aput` and `apwrite` operations to improve performance. * Improved error handling for decoding issues to enhance data processing reliability. * Excluded headers from logs to improve security while maintaining runtime functionality. * Fixed an error that prevented mapping slots to a single node. * Added debug logs to track node execution in JS deployments for improved issue diagnosis. * Changed the default multitask strategy to enqueue, improving throughput by eliminating the need to fetch inflight runs during new run insertions. * Optimized database operations for `Runs.next` and `Runs.sweep` to reduce redundant queries and improve efficiency. * Improved run creation speed by skipping unnecessary inflight runs queries. ## v0.2.129 * Stopped passing internal LGP fields to context to prevent breaking type checks. * Exposed content-location headers to ensure correct resumability behavior in the API. ## v0.2.128 * Ensured synchronized updates between `configurable` and `context` in assistants, preventing setup errors and supporting smoother version transitions. ## v0.2.127 * Excluded unrequested stream modes from the resumable stream to optimize functionality. ## v0.2.126 * Made access logger headers configurable to enhance logging flexibility. * Debounced the Runs.stats function to reduce the frequency of expensive calls and improve performance. 
* Introduced debouncing for sweepers to enhance performance and efficiency (#1147). * Acquired a lock for TTL sweeping to prevent database spamming during scale-out operations. ## v0.2.125 * Updated tracing context replicas to use the new format, ensuring compatibility. ## v0.2.123 * Added an entrypoint to the queue replica for improved deployment management. ## v0.2.122 * Utilized persisted interrupt status in `join` to ensure correct handling of user's interrupt state after completion. ## v0.2.121 * Consolidated events to a single channel to prevent race conditions and optimize startup performance. * Ensured custom lifespans are invoked on queue workers for proper setup, and added tests. ## v0.2.120 * Restored the original streaming behavior of runs, ensuring consistent inclusion of interrupt events based on `stream_mode` settings. * Optimized `Runs.next` query to reduce average execution time from \~14.43ms to \~2.42ms, improving performance. * Added support for stream mode "tasks" and "checkpoints", normalized the UI namespace, and upgraded `@langchain/langgraph-api` for enhanced functionality. ## v0.2.117 * Added a composite index on threads for faster searches with owner-based authentication and updated the default sort order to `updated_at` for improved query performance. ## v0.2.116 * Reduced the default number of history checkpoints from 10 to 1 to optimize performance. ## v0.2.115 * Optimized cache re-use to enhance application performance and efficiency. ## v0.2.113 * Improved thread search pagination by updating response headers with `X-Pagination-Total` and `X-Pagination-Next` for better navigation. ## v0.2.112 * Ensured sync logging methods are awaited and added a linter to prevent future occurrences. * Fixed an issue where JavaScript tasks were not being populated correctly for JS graphs. ## v0.2.111 * Fixed JS graph streaming failure by starting the heartbeat as soon as the connection opens. ## v0.2.110 * Added interrupts as default values for join operations while preserving stream behavior. ## v0.2.109 * Fixed an issue where config schema was missing when `config_type` was not set, ensuring more reliable configurations. ## v0.2.108 * Prepared for LangGraph v0.6 compatibility with new context API support and bug fixes. ## v0.2.107 * Implemented caching for authentication processes to enhance performance and efficiency. * Optimized database performance by merging count and select queries. ## v0.2.106 * Made log streams resumable, enhancing reliability and improving user experience when reconnecting. ## v0.2.105 * Added a heapdump endpoint to save memory heap information to a file. ## v0.2.103 * Used the correct metadata endpoint to resolve issues with data retrieval. ## v0.2.102 * Captured interrupt events in the wait method to preserve previous behavior from langgraph 0.5.0. * Added support for SDK structlog in the JavaScript environment for enhanced logging capabilities. ## v0.2.101 * Corrected the metadata endpoint for self-hosted deployments. ## v0.2.99 * Improved license check by adding an in-memory cache and handling Redis connection errors more effectively. * Reloaded assistants to preserve manually created ones while discarding those removed from the configuration file. * Reverted changes to ensure the UI namespace for gen UI is a valid JavaScript property name. * Ensured that the UI namespace for generated UI is a valid JavaScript property name, improving API compliance. * Enhanced error handling to return a 422 status code for unprocessable entity requests. 
## v0.2.98 * Added context to langgraph nodes to improve log filtering and trace visibility. ## v0.2.97 * Improved interoperability with the ckpt ingestion worker on the main loop to prevent task scheduling issues. * Delayed queue worker startup until after migrations are completed to prevent premature execution. * Enhanced thread state error handling by adding specific metadata and improved response codes for better clarity when state updates fail during creation. * Exposed the interrupt ID when retrieving the thread state to improve API transparency. ## v0.2.96 * Added a fallback mechanism for configurable header patterns to handle exclude/include settings more effectively. ## v0.2.95 * Avoided setting the future if it is already done to prevent redundant operations. * Resolved compatibility errors in CI by switching from `typing.TypedDict` to `typing_extensions.TypedDict` for Python versions below 3.12. ## v0.2.94 * Improved performance by omitting pending sends for langgraph versions 0.5 and above. * Improved server startup logs to provide clearer warnings when the DD\_API\_KEY environment variable is set. ## v0.2.93 * Removed the GIN index for run metadata to improve performance. ## v0.2.92 * Enabled copying functionality for blobs and checkpoints, improving data management flexibility. ## v0.2.91 * Reduced writes to the `checkpoint_blobs` table by inlining small values (null, numeric, str, etc.). This means we don't need to store extra values for channels that haven't been updated. ## v0.2.90 * Improve checkpoint writes via node-local background queueing. ## v0.2.89 * Decoupled checkpoint writing from thread/run state by removing foreign keys and updated logger to prevent timeout-related failures. ## v0.2.88 * Removed the foreign key constraint for `thread` in the `run` table to simplify database schema. ## v0.2.87 * Added more detailed logs for Redis worker signaling to improve debugging. ## v0.2.86 * Honored tool descriptions in the `/mcp` endpoint to align with expected functionality. ## v0.2.85 * Added support for the `on_disconnect` field to `runs/wait` and included disconnect logs for better debugging. ## v0.2.84 * Removed unnecessary status updates to streamline thread handling and updated version to 0.2.84. ## v0.2.83 * Reduced the default time-to-live for resumable streams to 2 minutes. * Enhanced data submission logic to send data to both Beacon and LangSmith instance based on license configuration. * Enabled submission of self-hosted data to a LangSmith instance when the endpoint is configured. ## v0.2.82 * Addressed a race condition in background runs by implementing a lock using join, ensuring reliable execution across CTEs. ## v0.2.81 * Optimized run streams by reducing initial wait time to improve responsiveness for older or non-existent runs. ## v0.2.80 * Corrected parameter passing in the `logger.ainfo()` API call to resolve a TypeError. ## v0.2.79 * Fixed a JsonDecodeError in checkpointing with remote graph by correcting JSON serialization to handle trailing slashes properly. * Introduced a configuration flag to disable webhooks globally across all routes. ## v0.2.78 * Added timeout retries to webhook calls to improve reliability. * Added HTTP request metrics, including a request count and latency histogram, for enhanced monitoring capabilities. ## v0.2.77 * Added HTTP metrics to improve performance monitoring. * Changed the Redis cache delimiter to reduce conflicts with subgraph message names and updated caching behavior. 
## v0.2.76 * Updated Redis cache delimiter to prevent conflicts with subgraph messages. ## v0.2.74 * Scheduled webhooks in an isolated loop to ensure thread-safe operations and prevent errors with PYTHONASYNCIODEBUG=1. ## v0.2.73 * Fixed an infinite frame loop issue and removed the dict\_parser due to structlog's unexpected behavior. * Throw a 409 error on deadlock occurrence during run cancellations to handle lock conflicts gracefully. ## v0.2.72 * Ensured compatibility with future langgraph versions. * Implemented a 409 response status to handle deadlock issues during cancellation. ## v0.2.71 * Improved logging for better clarity and detail regarding log types. ## v0.2.70 * Improved error handling to better distinguish and log TimeoutErrors caused by users from internal run timeouts. ## v0.2.69 * Added sorting and pagination to the crons API and updated schema definitions for improved accuracy. ## v0.2.66 * Fixed a 404 error when creating multiple runs with the same thread\_id using `on_not_exist="create"`. ## v0.2.65 * Ensured that only fields from `assistant_versions` are returned when necessary. * Ensured consistent data types for in-memory and PostgreSQL users, improving internal authentication handling. ## v0.2.64 * Added descriptions to version entries for better clarity. ## v0.2.62 * Improved user handling for custom authentication in the JS Studio. * Added Prometheus-format run statistics to the metrics endpoint for better monitoring. * Added run statistics in Prometheus format to the metrics endpoint. ## v0.2.61 * Set a maximum idle time for Redis connections to prevent unnecessary open connections. ## v0.2.60 * Enhanced error logging to include traceback details for dictionary operations. * Added a `/metrics` endpoint to expose queue worker metrics for monitoring. ## v0.2.57 * Removed CancelledError from retriable exceptions to allow local interrupts while maintaining retriability for workers. * Introduced middleware to gracefully shut down the server after completing in-flight requests upon receiving a SIGINT. * Reduced metadata stored in checkpoint to only include necessary information. * Improved error handling in join runs to return error details when present. ## v0.2.56 * Improved application stability by adding a handler for SIGTERM signals. ## v0.2.55 * Improved the handling of cancellations in the queue entrypoint. * Improved cancellation handling in the queue entry point. ## v0.2.54 * Enhanced error message for LuaLock timeout during license validation. * Fixed the \$contains filter in custom auth by requiring an explicit ::text cast and updated tests accordingly. * Ensured project and tenant IDs are formatted as UUIDs for consistency. ## v0.2.53 * Resolved a timing issue to ensure the queue starts only after the graph is registered. * Improved performance by setting thread and run status in a single query and enhanced error handling during checkpoint writes. * Reduced the default background grace period to 3 minutes. ## v0.2.52 * Now logging expected graphs when one is omitted to improve traceability. * Implemented a time-to-live (TTL) feature for resumable streams. * Improved query efficiency and consistency by adding a unique index and optimizing row locking. ## v0.2.51 * Handled `CancelledError` by marking tasks as ready to retry, improving error management in worker processes. * Added LG API version and request ID to metadata and logs for better tracking. * Added LG API version and request ID to metadata and logs to improve traceability. 
* Improved database performance by creating indexes concurrently. * Ensured postgres write is committed only after the Redis running marker is set to prevent race conditions. * Enhanced query efficiency and reliability by adding a unique index on thread\_id/running, optimizing row locks, and ensuring deterministic run selection. * Resolved a race condition by ensuring Postgres updates only occur after the Redis running marker is set. ## v0.2.46 * Introduced a new connection for each operation while preserving transaction characteristics in Threads state `update()` and `bulk()` commands. ## v0.2.45 * Enhanced streaming feature by incorporating tracing contexts. * Removed an unnecessary query from the Crons.search function. * Resolved connection reuse issue when scheduling next run for multiple cron jobs. * Removed an unnecessary query in the Crons.search function to improve efficiency. * Resolved an issue with scheduling the next cron run by improving connection reuse. ## v0.2.44 * Enhanced the worker logic to exit the pipeline before continuing when the Redis message limit is reached. * Introduced a ceiling for Redis message size with an option to skip messages larger than 128 MB for improved performance. * Ensured the pipeline always closes properly to prevent resource leaks. ## v0.2.43 * Improved performance by omitting logs in metadata calls and ensuring output schema compliance in value streaming. * Ensured the connection is properly closed after use. * Aligned output format to strictly adhere to the specified schema. * Stopped sending internal logs in metadata requests to improve privacy. ## v0.2.42 * Added timestamps to track the start and end of a request's run. * Added tracer information to the configuration settings. * Added support for streaming with tracing contexts. ## v0.2.41 * Added locking mechanism to prevent errors in pipelined executions. *** [Edit the source of this page on GitHub.](https://github.com/langchain-ai/docs/edit/main/src/langsmith/agent-server-changelog.mdx) [Connect these docs programmatically](/use-these-docs) to Claude, VSCode, and more via MCP for real-time answers. # Configure LangSmith Agent Server for scale Source: https://docs.langchain.com/langsmith/agent-server-scale The default configuration for LangSmith Agent Server is designed to handle substantial read and write load across a variety of different workloads. By following the best practices outlined below, you can tune your Agent Server to perform optimally for your specific workload. This page describes scaling considerations for the Agent Server and provides examples to help configure your deployment. For some example self-hosted configurations, refer to the [Example Agent Server configurations for scale](#example-agent-server-configurations-for-scale) section. ## Scaling for write load Write load is primarily driven by the following factors: * Creation of new [runs](/langsmith/background-run) * Creation of new checkpoints during run execution * Writing to long term memory * Creation of new [threads](/langsmith/use-threads) * Creation of new [assistants](/langsmith/assistants) * Deletion of runs, checkpoints, threads, assistants and cron jobs The following components are primarily responsible for handling write load: * API server: Handles initial request and persistence of data to the database. * Queue worker: Handles the execution of runs. * Redis: Handles the storage of ephemeral data about on-going runs. 
* Postgres: Handles the storage of all data, including run, thread, assistant, cron job, checkpointing and long term memory.

### Best practices for scaling the write path

#### Change `N_JOBS_PER_WORKER` based on assistant characteristics

The default value of [`N_JOBS_PER_WORKER`](/langsmith/env-var#n-jobs-per-worker) is 10. You can change this value to scale the maximum number of runs that can be executed at a time by a single queue worker based on the characteristics of your assistant.

Some general guidelines for changing `N_JOBS_PER_WORKER`:

* If your assistant is CPU-bound, the default value of 10 is likely sufficient. You might lower `N_JOBS_PER_WORKER` if you notice excessive CPU usage on queue workers or delays in run execution.
* If your assistant is IO-bound, increase `N_JOBS_PER_WORKER` to handle more concurrent runs per worker.

There is no upper limit to `N_JOBS_PER_WORKER`. However, queue workers are greedy when fetching new runs, which means they will try to pick up as many runs as they have available jobs and begin executing them immediately. Setting `N_JOBS_PER_WORKER` too high in environments with bursty traffic can lead to uneven worker utilization and increased run execution times.

#### Avoid synchronous blocking operations

Avoid synchronous blocking operations in your code and prefer asynchronous operations. Long synchronous operations can block the main event loop, causing longer request and run execution times and potential timeouts. For example, consider an application that needs to sleep for 1 second. Instead of using synchronous code like this:

```python theme={null}
import time

def my_function():
    time.sleep(1)
```

Prefer asynchronous code like this:

```python theme={null}
import asyncio

async def my_function():
    await asyncio.sleep(1)
```

If an assistant requires synchronous blocking operations, set [`BG_JOB_ISOLATED_LOOPS`](/langsmith/env-var#bg-job-isolated-loops) to `True` to execute each run in a separate event loop.

#### Minimize redundant checkpointing

Minimize redundant checkpointing by setting [`durability`](/oss/python/langgraph/durable-execution#durability-modes) to the minimum value necessary to ensure your data is durable. The default durability mode is `"async"`, meaning checkpoints are written asynchronously after each step. If an assistant needs to persist only the final state of the run, `durability` can be set to `"exit"`, storing only the final state. This can be set when creating the run:

```python theme={null}
from langgraph_sdk import get_client

client = get_client(url="<DEPLOYMENT_URL>")  # replace with your deployment URL

thread = await client.threads.create()
run = await client.runs.create(
    thread_id=thread["thread_id"],
    assistant_id="agent",
    durability="exit"
)
```

#### Self-hosted

These settings are only required for [self-hosted](/langsmith/self-hosted) deployments. By default, [cloud](/langsmith/cloud) deployments already have these best practices enabled.

##### Enable the use of queue workers

By default, the API server manages the queue and does not use queue workers. You can enable the use of queue workers by setting the `queue.enabled` configuration to `true`.

```yaml theme={null}
queue:
  enabled: true
```

This will allow the API server to offload queue management to the queue workers, significantly reducing the load on the API server and allowing it to focus on handling requests.

##### Support a number of jobs equal to expected throughput

The more runs you execute in parallel, the more jobs you will need to handle the load.
There are two main parameters to scale the available jobs:

* `number_of_queue_workers`: The number of queue workers provisioned.
* `N_JOBS_PER_WORKER`: The number of runs that a single queue worker can execute at a time. Defaults to 10.

You can calculate the available jobs with the following equation:

```
available_jobs = number_of_queue_workers * N_JOBS_PER_WORKER
```

Throughput is then the number of runs that can be executed per second by the available jobs:

```
throughput_per_second = available_jobs / average_run_execution_time_seconds
```

Therefore, the minimum number of queue workers you should provision to support your expected steady state throughput is:

```
number_of_queue_workers = throughput_per_second * average_run_execution_time_seconds / N_JOBS_PER_WORKER
```

##### Configure autoscaling for bursty workloads

Autoscaling is disabled by default, but should be configured for bursty workloads. Using the same calculations as the [previous section](#support-a-number-of-jobs-equal-to-expected-throughput), you can determine the maximum number of queue workers you should allow the autoscaler to scale to based on maximum expected throughput.

## Scaling for read load

Read load is primarily driven by the following factors:

* Getting the results of a [run](/langsmith/background-run)
* Getting the state of a [thread](/langsmith/use-threads)
* Searching for [runs](/langsmith/background-run), [threads](/langsmith/use-threads), [cron jobs](/langsmith/cron-jobs) and [assistants](/langsmith/assistants)
* Retrieving checkpoints and long term memory

The following components are primarily responsible for handling read load:

* API server: Handles the request and direct retrieval of data from the database.
* Postgres: Handles the storage of all data, including run, thread, assistant, cron job, checkpointing and long term memory.
* Redis: Handles the storage of ephemeral data about on-going runs, including streaming messages from queue workers to API servers.

### Best practices for scaling the read path

#### Use filtering to reduce the number of resources returned per request

[Agent Server](/langsmith/agent-server) provides a search API for each resource type. These APIs implement pagination by default and offer many filtering options. Use filtering to reduce the number of resources returned per request and improve performance.

#### Set TTLs to automatically delete old data

Set a [TTL on threads](/langsmith/configure-ttl) to automatically clean up old data. Runs and checkpoints are automatically deleted when the associated thread is deleted.

#### Avoid polling and use /join to monitor the state of a run

Avoid polling the state of a run by using the `/join` API endpoint, which returns the final state of the run once the run is complete. If you need to monitor the output of a run in real time, use the `/stream` API endpoint instead, which streams the run output, including the final state of the run (see the sketch at the end of this section).

#### Self-hosted

These settings are only required for [self-hosted](/langsmith/self-hosted) deployments. By default, [cloud](/langsmith/cloud) deployments already have these best practices enabled.

##### Configure autoscaling for bursty workloads

Autoscaling is disabled by default, but should be configured for bursty workloads. You can determine the maximum number of API servers you should allow the autoscaler to scale to based on maximum expected throughput. The default for [cloud](/langsmith/cloud) deployments is a maximum of 10 API servers.
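To make the read-path guidance above concrete, here is a minimal sketch using the LangGraph Python SDK that waits on a run with `/join` instead of polling, and streams output with `/stream` when real-time monitoring is needed. The deployment URL, the `agent` graph name, and the input are placeholders:

```python theme={null}
# Sketch of avoiding polling with the LangGraph Python SDK; URL, graph name,
# and inputs are placeholders for your own deployment.
import asyncio

from langgraph_sdk import get_client


async def main():
    client = get_client(url="<DEPLOYMENT_URL>")
    thread = await client.threads.create()

    # Start a background run against the default "agent" assistant.
    run = await client.runs.create(
        thread["thread_id"],
        "agent",
        input={"messages": [{"role": "user", "content": "Summarize the latest report"}]},
    )

    # Option 1: block until the run completes and fetch its final state (/join).
    final_state = await client.runs.join(thread["thread_id"], run["run_id"])
    print(final_state)

    # Option 2: stream output in real time (/stream), including the final state.
    async for chunk in client.runs.stream(
        thread["thread_id"],
        "agent",
        input={"messages": [{"role": "user", "content": "Summarize the latest report"}]},
        stream_mode="values",
    ):
        print(chunk.event, chunk.data)


asyncio.run(main())
```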
## Example Agent Server configurations for scale

The exact optimal configuration depends on your application complexity, request patterns, and data requirements. Use the following examples in combination with the information in the previous sections and your specific usage to update your deployment configuration as needed. If you have any questions, reach out to the LangChain team at [support@langchain.dev](mailto:support@langchain.dev).

The following table provides an overview comparing different LangSmith Agent Server configurations for various load patterns (read requests per second / write requests per second) and standard assistant characteristics (average run execution time of 1 second, moderate CPU and memory usage):

| | **[Low / low](#low-reads-low-writes)** | **[Low / high](#low-reads-high-writes)** | **[High / low](#high-reads-low-writes)** | **[Medium / medium](#medium-reads-medium-writes)** | **[High / high](#high-reads-high-writes)** |
| :--- | :--- | :--- | :--- | :--- | :--- |
| Read requests per second | 5 | 5 | 500 | 50 | 500 |
| Write requests per second | 5 | 500 | 5 | 50 | 500 |
| **API servers** (1 CPU, 2Gi per server) | 1 (default) | 6 | 10 | 3 | 15 |
| **Queue workers** (1 CPU, 2Gi per worker) | 1 (default) | 10 | 1 (default) | 5 | 10 |
| **`N_JOBS_PER_WORKER`** | 10 (default) | 50 | 10 | 10 | 50 |
| **Redis resources** | 2 Gi (default) | 2 Gi (default) | 2 Gi (default) | 2 Gi (default) | 2 Gi (default) |
| **Postgres resources** | 2 CPU, 8 Gi (default) | 4 CPU, 16 Gi | 4 CPU, 16 Gi | 4 CPU, 16 Gi | 8 CPU, 32 Gi |

The following sample configurations enable each of these setups. Load levels are defined as:

* Low means approximately 5 requests per second
* Medium means approximately 50 requests per second
* High means approximately 500 requests per second

### Low reads, low writes
The default [LangSmith Deployment](/langsmith/deployments) configuration will handle this load. No custom resource configuration is needed here. ### Low reads, high writes You have a high volume of write requests (500 per second) being processed by your deployment, but relatively few read requests (5 per second). For this, we recommend a configuration like this: ```yaml theme={null} # Example configuration for low reads, high writes (5 read/500 write requests per second) api: replicas: 6 resources: requests: cpu: "1" memory: "2Gi" limits: cpu: "2" memory: "4Gi" queue: replicas: 10 resources: requests: cpu: "1" memory: "2Gi" limits: cpu: "2" memory: "4Gi" config: numberOfJobsPerWorker: 50 redis: resources: requests: memory: "2Gi" limits: memory: "2Gi" postgres: resources: requests: cpu: "4" memory: "16Gi" limits: cpu: "8" memory: "32Gi" ``` ### High reads, low writes You have a high volume of read requests (500 per second) but relatively few write requests (5 per second). For this, we recommend a configuration like this: ```yaml theme={null} # Example configuration for high reads, low writes (500 read/5 write requests per second) api: replicas: 10 resources: requests: cpu: "1" memory: "2Gi" limits: cpu: "2" memory: "4Gi" queue: replicas: 1 # Default, minimal write load resources: requests: cpu: "1" memory: "2Gi" limits: cpu: "2" memory: "4Gi" redis: resources: requests: memory: "2Gi" limits: memory: "2Gi" postgres: resources: requests: cpu: "4" memory: "16Gi" limits: cpu: "8" memory: "32Gi" # Consider read replicas for high read scenarios readReplicas: 2 ``` ### Medium reads, medium writes This is a balanced configuration that should handle moderate read and write loads (50 read/50 write requests per second). For this, we recommend a configuration like this: ```yaml theme={null} # Example configuration for medium reads, medium writes (50 read/50 write requests per second) api: replicas: 3 resources: requests: cpu: "1" memory: "2Gi" limits: cpu: "2" memory: "4Gi" queue: replicas: 5 resources: requests: cpu: "1" memory: "2Gi" limits: cpu: "2" memory: "4Gi" redis: resources: requests: memory: "2Gi" limits: memory: "2Gi" postgres: resources: requests: cpu: "4" memory: "16Gi" limits: cpu: "8" memory: "32Gi" ``` ### High reads, high writes You have high volumes of both read and write requests (500 read/500 write requests per second). For this, we recommend a configuration like this: ```yaml theme={null} # Example configuration for high reads, high writes (500 read/500 write requests per second) api: replicas: 15 resources: requests: cpu: "1" memory: "2Gi" limits: cpu: "2" memory: "4Gi" queue: replicas: 10 resources: requests: cpu: "1" memory: "2Gi" limits: cpu: "2" memory: "4Gi" config: numberOfJobsPerWorker: 50 redis: resources: requests: memory: "2Gi" limits: memory: "2Gi" postgres: resources: requests: cpu: "8" memory: "32Gi" limits: cpu: "16" memory: "64Gi" ``` ### Autoscaling If your deployment experiences bursty traffic, you can enable autoscaling to scale the number of API servers and queue workers to handle the load. Here is a sample configuration for autoscaling for high reads and high writes: ```yaml theme={null} api: autoscaling: enabled: true minReplicas: 15 maxReplicas: 25 queue: autoscaling: enabled: true minReplicas: 10 maxReplicas: 20 ``` Ensure that your deployment environment has sufficient resources to scale to the recommended size. Monitor your applications and infrastructure to ensure optimal performance. 
Consider implementing monitoring and alerting to track resource usage and application performance. *** [Edit the source of this page on GitHub.](https://github.com/langchain-ai/docs/edit/main/src/langsmith/agent-server-scale.mdx) [Connect these docs programmatically](/use-these-docs) to Claude, VSCode, and more via MCP for real-time answers. # Alerts in LangSmith Source: https://docs.langchain.com/langsmith/alerts **Self-hosted Version Requirement** Access to alerts requires Helm chart version **0.10.3** or later. ## Overview Effective observability in LLM applications requires proactive detection of failures, performance degradations, and regressions. LangSmith's alerts feature helps identify critical issues such as: * API rate limit violations from model providers * Latency increases for your application * Application changes that affect feedback scores reflecting end-user experience Alerts in LangSmith are project-scoped, requiring separate configuration for each monitored project. ## Configuring an alert ### Step 1: Navigate To Create Alert First navigate to the Tracing project that you would like to configure alerts for. Click the Alerts icon on the top right hand corner of the page to view existing alerts for that project and set up a new alert. ### Step 2: Select Metric Type
Alert Metrics
LangSmith offers threshold-based alerting on three core metrics: | Metric Type | Description | Use Case | | ------------------ | ----------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------- | | **Errored Runs** | Track runs with an error status | Monitors for failures in an application. | | **Feedback Score** | Measures the average feedback score | Track [feedback from end users](/langsmith/attach-user-feedback) or [online evaluation results](/langsmith/online-evaluations) to alert on regressions. | | **Latency** | Measures average run execution time | Tracks the latency of your application to alert on spikes and performance bottlenecks. | Additionally, for **Errored Runs** and **Run Latency**, you can define filters to narrow down the runs that trigger alerts. For example, you might create an error alert filter for all `llm` runs tagged with `support_agent` that encounter a `RateLimitExceeded` error.
Alert Metrics
### Step 3: Define Alert Conditions

Alert conditions consist of several components:

* **Aggregation Method**: Average, Percentage, or Count
* **Comparison Operator**: `>=`, `<=`, or exceeds threshold
* **Threshold Value**: Numerical value triggering the alert
* **Aggregation Window**: Time period for metric calculation (currently choose between 5 or 15 minutes)
* **Feedback Key** (Feedback Score alerts only): Specific feedback metric to monitor
Alert Condition Configuration
**Example:** The configuration shown above would generate an alert when more than 5% of runs within the past 5 minutes result in errors. You can preview alert behavior over a historical time window to understand how many datapoints—and which ones—would have triggered an alert at a chosen threshold (indicated in red). For example, setting an average latency threshold of 60 seconds for a project lets you visualize potential alerts, as shown in the image below.
Alert Metrics
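As a rough mental model only (not LangSmith's internal implementation), a percentage-based error alert behaves like the following sketch: the runs that finished inside the aggregation window are counted, the errored fraction is computed, and the alert fires when that fraction crosses the threshold. All names and numbers here are illustrative:

```python theme={null}
from datetime import datetime, timedelta, timezone

def should_alert(runs, threshold_pct=5.0, window=timedelta(minutes=5)):
    """Illustrative check: fire when more than `threshold_pct` of runs in the
    window errored. `runs` is a list of (finished_at: datetime, is_error: bool)."""
    cutoff = datetime.now(timezone.utc) - window
    recent = [is_error for finished_at, is_error in runs if finished_at >= cutoff]
    if not recent:
        return False
    error_pct = 100 * sum(recent) / len(recent)
    return error_pct > threshold_pct
```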
### Step 4: Configure Notification Channel

LangSmith supports the following notification channels:

1. [PagerDuty Integration](/langsmith/alerts-pagerduty)
2. [Webhook Notifications](/langsmith/alerts-webhook)

Select the appropriate channel to ensure notifications reach the responsible team members.

## Best Practices

* Adjust sensitivity based on application criticality
* Start with broader thresholds and refine based on observed patterns
* Ensure alert routing reaches appropriate on-call personnel

*** [Edit the source of this page on GitHub.](https://github.com/langchain-ai/docs/edit/main/src/langsmith/alerts.mdx) [Connect these docs programmatically](/use-these-docs) to Claude, VSCode, and more via MCP for real-time answers.

# Configure webhook notifications for LangSmith alerts

Source: https://docs.langchain.com/langsmith/alerts-webhook

## Overview

This guide details the process for setting up webhook notifications for [LangSmith alerts](/langsmith/alerts). Before proceeding, make sure you have completed the steps leading up to the notification step of creating an alert by following [this guide](./alerts). Webhooks enable integration with custom services and third-party platforms by sending HTTP POST requests when alert conditions are triggered. Use webhooks to forward alert data to ticketing systems, chat applications, or custom monitoring solutions.

## Prerequisites

* An endpoint that can receive HTTP POST requests
* Appropriate authentication credentials for your receiving service (if required)

## Integration Configuration

### Step 1: Prepare Your Receiving Endpoint

Before configuring the webhook in LangSmith, ensure your receiving endpoint:

* Accepts HTTP POST requests
* Can process JSON payloads
* Is accessible from external services
* Has appropriate authentication mechanisms (if required)

Additionally, if you are on a custom deployment of LangSmith, make sure there are no firewall settings blocking egress traffic from LangSmith services.

### Step 2: Configure Webhook Parameters

In the notification section of your alert, complete the webhook configuration with the following parameters:

**Required Fields**

* **URL**: The complete URL of your receiving endpoint
  * Example: `https://api.example.com/incident-webhook`

**Optional Fields**

* **Headers**: JSON key-value pairs sent with the webhook request
  * Common headers include:
    * `Authorization`: For authentication tokens
    * `Content-Type`: Usually set to `application/json` (default)
    * `X-Source`: To identify the source as LangSmith
  * If no headers are needed, use `{}`
* **Request Body Template**: Customize the JSON payload sent to your endpoint
  * Default: LangSmith sends the payload you define with the following additional key-value pairs appended:
    * `project_name`: Name of the project associated with the triggered alert
    * `alert_rule_id`: A UUID that identifies the LangSmith alert. This can be used as a de-duplication key in the webhook service.
    * `alert_rule_name`: The name of the alert rule.
    * `alert_rule_type`: The type of alert (as of 04/01/2025 all alerts are of type `threshold`).
    * `alert_rule_attribute`: The attribute associated with the alert rule - `error_count`, `feedback_score` or `latency`.
    * `triggered_metric_value`: The value of the metric at the time the threshold was triggered.
    * `triggered_threshold`: The threshold that triggered the alert.
    * `timestamp`: The timestamp at which the alert was triggered.

### Step 3: Test the Webhook

Click **Send Test Alert** to send the webhook notification and ensure it works as intended.
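Because the exact payload depends on the body template you configure, the sketch below is only an illustration of a receiving endpoint: a minimal Flask handler that reads the appended fields listed above. The route and the forwarding logic are hypothetical; this is not an official example.

```python theme={null}
# Minimal illustrative receiver for LangSmith alert webhooks (not an official example).
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.post("/incident-webhook")  # hypothetical route; match the URL configured in LangSmith
def incident_webhook():
    payload = request.get_json(force=True)
    # Fields appended by LangSmith (see the list above).
    summary = {
        "project": payload.get("project_name"),
        "rule": payload.get("alert_rule_name"),
        "dedup_key": payload.get("alert_rule_id"),
        "metric": payload.get("alert_rule_attribute"),
        "value": payload.get("triggered_metric_value"),
        "threshold": payload.get("triggered_threshold"),
        "at": payload.get("timestamp"),
    }
    print(summary)  # forward to your ticketing or chat system here
    return jsonify({"status": "received"}), 200
```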
## Troubleshooting If webhook notifications aren't being delivered: * Verify the webhook URL is correct and accessible * Ensure any authentication headers are properly formatted * Check that your receiving endpoint accepts POST requests * Examine your endpoint's logs for received but rejected requests * Verify your custom payload template is valid JSON format ## Security Considerations * Use HTTPS for your webhook endpoints * Implement authentication for your webhook endpoint * Consider adding a shared secret in your headers to verify webhook sources * Validate incoming webhook requests before processing them ## Sending alerts to Slack using a webhook Here is an example for configuring LangSmith alerts to send notifications to Slack channels using the [`chat.postMessage`](https://api.slack.com/methods/chat.postMessage) API. ### Prerequisites * Access to a Slack workspace * A LangSmith project to set up alerts * Permissions to create Slack applications ### Step 1: Create a Slack App 1. Visit the [Slack API Applications page](https://api.slack.com/apps) 2. Click **Create New App** 3. Select **From scratch** 4. Provide an **App Name** (e.g., "LangSmith Alerts") 5. Select the workspace where you want to install the app 6. Click **Create App** ### Step 2: Configure Bot Permissions 1. In the left sidebar of your Slack app configuration, click **OAuth & Permissions** 2. Scroll down to **Bot Token Scopes** under **Scopes** and click **Add an OAuth Scope** 3. Add the following scopes: * `chat:write` (Send messages as the app) * `chat:write.public` (Send messages to channels the app isn't in) * `channels:read` (View basic channel information) ### Step 3: Install the App to Your Workspace 1. Scroll up to the top of the **OAuth & Permissions** page 2. Click **Install to Workspace** 3. Review the permissions and click **Allow** 4. Copy the **Bot User OAuth Token** that appears (begins with `xoxb-`) ### Step 4: Configure the Webhook Alert in LangSmith 1. In LangSmith, navigate to your project 2. Select **Alerts → Create Alert** 3. Define your alert metrics and conditions 4. In the notification section, select **Webhook** 5. Configure the webhook with the following settings: **Webhook URL** ```json theme={null} https://slack.com/api/chat.postMessage ``` **Headers** ```json theme={null} { "Content-Type": "application/json", "Authorization": "Bearer xoxb-your-token-here" } ``` > **Note:** Replace `xoxb-your-token-here` with your actual Bot User OAuth Token **Request Body Template** ```json theme={null} { "channel": "{channel_id}", "text": "{alert_name} triggered for {project_name}", "blocks": [ { "type": "section", "text": { "type": "mrkdwn", "text": "🚨{alert_name} has been triggered" } }, { "type": "section", "text": { "type": "mrkdwn", "text": "Please check the following link for more information:" } }, { "type": "section", "text": { "type": "mrkdwn", "text": "<{project-url}|View in LangSmith>" } } ] } ``` **NOTE:** Fill in the `channel_id`, `alert_name`, `project_name` and `project_url` when creating the alert. You can find your `project_url` in the browser's URL bar. Copy the portion up to but not including any query parameters. 6. Click **Save** to activate the webhook configuration ### Step 5: Test the Integration 1. In the LangSmith alert configuration, click **Test Alert** 2. Check your specified Slack channel for the test notification 3. 
Verify that the message contains the expected alert information

### (Optional) Step 6: Link to the Alert Preview in the Request Body

After creating an alert, you can optionally link to its preview in the webhook's request body.

Alert Preview Pane

To configure this:

1. Save your alert
2. Find your saved alert in the alerts table and click it
3. Copy the displayed URL
4. Click "Edit Alert"
5. Replace the existing project URL with the copied alert preview URL

## Additional Resources

* [LangSmith Alerts Documentation](/langsmith/alerts)
* [Slack chat.postMessage API Documentation](https://api.slack.com/methods/chat.postMessage)
* [Slack Block Kit Builder](https://app.slack.com/block-kit-builder/)

*** [Edit the source of this page on GitHub.](https://github.com/langchain-ai/docs/edit/main/src/langsmith/alerts-webhook.mdx) [Connect these docs programmatically](/use-these-docs) to Claude, VSCode, and more via MCP for real-time answers.

# Analyze an experiment

Source: https://docs.langchain.com/langsmith/analyze-an-experiment

This page describes some of the essential tasks for working with [*experiments*](/langsmith/evaluation-concepts#experiment) in LangSmith:

* **[Analyze a single experiment](#analyze-a-single-experiment)**: View and interpret experiment results, customize columns, filter data, and compare runs.
* **[Download experiment results as a CSV](#download-experiment-results-as-a-csv)**: Export your experiment data for external analysis and sharing.
* **[Rename an experiment](#rename-an-experiment)**: Update experiment names in both the Playground and Experiments view.

## Analyze a single experiment

After running an experiment, you can use LangSmith's experiment view to analyze the results and draw insights about your experiment's performance.

### Open the experiment view

To open the experiment view, select the relevant [*dataset*](/langsmith/evaluation-concepts#datasets) from the **Datasets & Experiments** page and then select the experiment you want to view.

Open experiment view

### View experiment results

#### Customize columns

By default, the experiment view shows the input, output, and reference output for each [example](/langsmith/evaluation-concepts#examples) in the dataset, feedback scores from evaluations, and experiment metrics like cost, token counts, latency, and status. You can customize the columns using the **Display** button to make it easier to interpret experiment results:

* **Break out fields from inputs, outputs, and reference outputs** into their own columns. This is especially helpful if you have long inputs/outputs/reference outputs and want to surface important fields.
* **Hide and reorder columns** to create focused views for analysis.
* **Control decimal precision on feedback scores**. By default, LangSmith surfaces numerical feedback scores with a decimal precision of 2, but you can customize this setting to be up to 6 decimals.
* **Set the Heat Map threshold** to high, middle, and low for numeric feedback scores in your experiment, which affects the threshold at which score chips render as red or green:

Column heatmap configuration

You can set default configurations for an entire dataset or temporarily save settings just for yourself.

#### Sort and filter

To sort or filter feedback scores, you can use the actions in the column headers.

Sort and filter

#### Table views

Depending on the view most useful for your analysis, you can change the formatting of the table by toggling between a compact view, a full view, and a diff view.
* The **Compact** view shows each run as a one-line row, for ease of comparing scores at a glance. * The **Full** view shows the full output for each run for digging into the details of individual runs. * The **Diff** view shows the text difference between the reference output and the output for each run. Diff view #### View the traces Hover over any of the output cells, and click on the trace icon to view the trace for that run. This will open up a trace in the side panel. To view the entire tracing project, click on the **View Project** button in the top right of the header. View trace #### View evaluator runs For evaluator scores, you can view the source run by hovering over the evaluator score cell and clicking on the arrow icon. This will open up a trace in the side panel. If you're running a [LLM-as-a-judge evaluator](/langsmith/llm-as-judge), you can view the prompt used for the evaluator in this run. If your experiment has [repetitions](/langsmith/evaluation-concepts#repetitions), you can click on the aggregate average score to find links to all of the individual runs. View evaluator runs ### Group results by metadata You can add metadata to examples to categorize and organize them. For example, if you're evaluating factual accuracy on a question answering dataset, the metadata might include which subject area each question belongs to. Metadata can be added either [via the UI](/langsmith/manage-datasets-in-application#edit-example-metadata) or [via the SDK](/langsmith/manage-datasets-programmatically#update-single-example). To analyze results by metadata, use the **Group by** dropdown in the top right corner of the experiment view and select your desired metadata key. This displays average feedback scores, latency, total tokens, and cost for each metadata group. You will only be able to group by example metadata on experiments created after February 20th, 2025. Any experiments before that date can still be grouped by metadata, but only if the metadata is on the experiment traces themselves. ### Repetitions If you've run your experiment with [*repetitions*](/langsmith/evaluation-concepts#repetitions), there will be arrows in the output results column so you can view outputs in the table. To view each run from the repetition, hover over the output cell and click the expanded view. When you run an experiment with repetitions, LangSmith displays the average for each feedback score in the table. Click on the feedback score to view the feedback scores from individual runs, or to view the standard deviation across repetitions. Repetitions ### Compare to another experiment In the top right of the experiment view, you can select another experiment to compare to. This will open up a comparison view, where you can see how the two experiments compare. To learn more about the comparison view, see [how to compare experiment results](/langsmith/compare-experiment-results). ## Download experiment results as a CSV LangSmith lets you download experiment results as a CSV file, which allows you to analyze and share your results. To download as a CSV, click the download icon at the top of the experiment view. The icon is directly to the left of the [Compact toggle](/langsmith/compare-experiment-results#adjust-the-table-display). Download CSV ## Rename an experiment Experiment names must be unique per workspace. You can rename an experiment in the LangSmith UI in: * The [Playground](#renaming-an-experiment-in-the-playground). 
When running experiments in the Playground, a default name with the format `pg::prompt-name::model::uuid` (eg. `pg::gpt-4o-mini::897ee630`) is automatically assigned. You can rename an experiment immediately after running it by editing its name in the Playground table header. Edit name in playground * The [Experiments view](#renaming-an-experiment-in-the-experiments-view). When viewing results in the experiments view, you can rename an experiment by using the pencil icon beside the experiment name. Edit name in experiments view *** [Edit the source of this page on GitHub.](https://github.com/langchain-ai/docs/edit/main/src/langsmith/analyze-an-experiment.mdx) [Connect these docs programmatically](/use-these-docs) to Claude, VSCode, and more via MCP for real-time answers. # Custom instrumentation Source: https://docs.langchain.com/langsmith/annotate-code If you've decided you no longer want to trace your runs, you can remove the `LANGSMITH_TRACING` environment variable. Note that this does not affect the `RunTree` objects or API users, as these are meant to be low-level and not affected by the tracing toggle. There are several ways to log traces to LangSmith. If you are using LangChain (either Python or JS/TS), you can skip this section and go directly to the [LangChain-specific instructions](/langsmith/trace-with-langchain). ## Use `@traceable` / `traceable` LangSmith makes it easy to log traces with minimal changes to your existing code with the `@traceable` decorator in Python and `traceable` function in TypeScript. The `LANGSMITH_TRACING` environment variable must be set to `'true'` in order for traces to be logged to LangSmith, even when using `@traceable` or `traceable`. This allows you to toggle tracing on and off without changing your code. Additionally, you will need to set the `LANGSMITH_API_KEY` environment variable to your API key (see [Setup](/) for more information). By default, the traces will be logged to a project named `default`. To log traces to a different project, see [this section](/langsmith/log-traces-to-project). The `@traceable` decorator is a simple way to log traces from the LangSmith Python SDK. Simply decorate any function with `@traceable`. Note that when wrapping a sync function with `traceable`, (e.g. `formatPrompt` in the example below), you should use the `await` keyword when calling it to ensure the trace is logged correctly. ```python Python theme={null} from langsmith import traceable from openai import Client openai = Client() @traceable def format_prompt(subject): return [ { "role": "system", "content": "You are a helpful assistant.", }, { "role": "user", "content": f"What's a good name for a store that sells {subject}?" 
} ] @traceable(run_type="llm") def invoke_llm(messages): return openai.chat.completions.create( messages=messages, model="gpt-4o-mini", temperature=0 ) @traceable def parse_output(response): return response.choices[0].message.content @traceable def run_pipeline(): messages = format_prompt("colorful socks") response = invoke_llm(messages) return parse_output(response) run_pipeline() ``` ```typescript TypeScript theme={null} import { traceable } from "langsmith/traceable"; import OpenAI from "openai"; const openai = new OpenAI(); const formatPrompt = traceable((subject: string) => { return [ { role: "system" as const, content: "You are a helpful assistant.", }, { role: "user" as const, content: `What's a good name for a store that sells ${subject}?`, }, ]; },{ name: "formatPrompt" }); const invokeLLM = traceable( async ({ messages }: { messages: { role: string; content: string }[] }) => { return openai.chat.completions.create({ model: "gpt-4o-mini", messages: messages, temperature: 0, }); }, { run_type: "llm", name: "invokeLLM" } ); const parseOutput = traceable( (response: any) => { return response.choices[0].message.content; }, { name: "parseOutput" } ); const runPipeline = traceable( async () => { const messages = await formatPrompt("colorful socks"); const response = await invokeLLM({ messages }); return parseOutput(response); }, { name: "runPipeline" } ); await runPipeline(); ``` ## Use the `trace` context manager (Python only) In Python, you can use the `trace` context manager to log traces to LangSmith. This is useful in situations where: 1. You want to log traces for a specific block of code. 2. You want control over the inputs, outputs, and other attributes of the trace. 3. It is not feasible to use a decorator or wrapper. 4. Any or all of the above. The context manager integrates seamlessly with the `traceable` decorator and `wrap_openai` wrapper, so you can use them together in the same application. ```python theme={null} import openai import langsmith as ls from langsmith.wrappers import wrap_openai client = wrap_openai(openai.Client()) @ls.traceable(run_type="tool", name="Retrieve Context") def my_tool(question: str) -> str: return "During this morning's meeting, we solved all world conflict." def chat_pipeline(question: str): context = my_tool(question) messages = [ { "role": "system", "content": "You are a helpful assistant. Please respond to the user's request only based on the given context." }, { "role": "user", "content": f"Question: {question}\nContext: {context}"} ] chat_completion = client.chat.completions.create( model="gpt-4o-mini", messages=messages ) return chat_completion.choices[0].message.content app_inputs = {"input": "Can you summarize this morning's meetings?"} with ls.trace("Chat Pipeline", "chain", project_name="my_test", inputs=app_inputs) as rt: output = chat_pipeline("Can you summarize this morning's meetings?") rt.end(outputs={"output": output}) ``` ## Use the `RunTree` API Another, more explicit way to log traces to LangSmith is via the `RunTree` API. This API allows you more control over your tracing - you can manually create runs and children runs to assemble your trace. You still need to set your `LANGSMITH_API_KEY`, but `LANGSMITH_TRACING` is not necessary for this method. This method is not recommended, as it's easier to make mistakes in propagating trace context. ```python Python theme={null} import openai from langsmith.run_trees import RunTree # This can be a user input to your app question = "Can you summarize this morning's meetings?" 
# Create a top-level run pipeline = RunTree( name="Chat Pipeline", run_type="chain", inputs={"question": question} ) pipeline.post() # This can be retrieved in a retrieval step context = "During this morning's meeting, we solved all world conflict." messages = [ { "role": "system", "content": "You are a helpful assistant. Please respond to the user's request only based on the given context." }, { "role": "user", "content": f"Question: {question}\nContext: {context}"} ] # Create a child run child_llm_run = pipeline.create_child( name="OpenAI Call", run_type="llm", inputs={"messages": messages}, ) child_llm_run.post() # Generate a completion client = openai.Client() chat_completion = client.chat.completions.create( model="gpt-4o-mini", messages=messages ) # End the runs and log them child_llm_run.end(outputs=chat_completion) child_llm_run.patch() pipeline.end(outputs={"answer": chat_completion.choices[0].message.content}) pipeline.patch() ``` ```typescript TypeScript theme={null} import OpenAI from "openai"; import { RunTree } from "langsmith"; // This can be a user input to your app const question = "Can you summarize this morning's meetings?"; const pipeline = new RunTree({ name: "Chat Pipeline", run_type: "chain", inputs: { question } }); await pipeline.postRun(); // This can be retrieved in a retrieval step const context = "During this morning's meeting, we solved all world conflict."; const messages = [ { role: "system", content: "You are a helpful assistant. Please respond to the user's request only based on the given context." }, { role: "user", content: `Question: ${question}Context: ${context}` } ]; // Create a child run const childRun = await pipeline.createChild({ name: "OpenAI Call", run_type: "llm", inputs: { messages }, }); await childRun.postRun(); // Generate a completion const client = new OpenAI(); const chatCompletion = await client.chat.completions.create({ model: "gpt-4o-mini", messages: messages, }); // End the runs and log them childRun.end(chatCompletion); await childRun.patchRun(); pipeline.end({ outputs: { answer: chatCompletion.choices[0].message.content } }); await pipeline.patchRun(); ``` ## Example usage You can extend the utilities above to conveniently trace any code. 
Below are some example extensions: Trace any public method in a class: ```python theme={null} from typing import Any, Callable, Type, TypeVar T = TypeVar("T") def traceable_cls(cls: Type[T]) -> Type[T]: """Instrument all public methods in a class.""" def wrap_method(name: str, method: Any) -> Any: if callable(method) and not name.startswith("__"): return traceable(name=f"{cls.__name__}.{name}")(method) return method # Handle __dict__ case for name in dir(cls): if not name.startswith("_"): try: method = getattr(cls, name) setattr(cls, name, wrap_method(name, method)) except AttributeError: # Skip attributes that can't be set (e.g., some descriptors) pass # Handle __slots__ case if hasattr(cls, "__slots__"): for slot in cls.__slots__: # type: ignore[attr-defined] if not slot.startswith("__"): try: method = getattr(cls, slot) setattr(cls, slot, wrap_method(slot, method)) except AttributeError: # Skip slots that don't have a value yet pass return cls @traceable_cls class MyClass: def __init__(self, some_val: int): self.some_val = some_val def combine(self, other_val: int): return self.some_val + other_val # See trace: https://smith.langchain.com/public/882f9ecf-5057-426a-ae98-0edf84fdcaf9/r MyClass(13).combine(29) ``` ## Ensure all traces are submitted before exiting LangSmith's tracing is done in a background thread to avoid obstructing your production application. This means that your process may end before all traces are successfully posted to LangSmith. Here are some options for ensuring all traces are submitted before exiting your application. ### Using the LangSmith SDK If you are using the LangSmith SDK standalone, you can use the `flush` method before exit: ```python Python theme={null} from langsmith import Client client = Client() @traceable(client=client) async def my_traced_func(): # Your code here... pass try: await my_traced_func() finally: await client.flush() ``` ```typescript TypeScript theme={null} import { Client } from "langsmith"; const langsmithClient = new Client({}); const myTracedFunc = traceable(async () => { // Your code here... },{ client: langsmithClient }); try { await myTracedFunc(); } finally { await langsmithClient.flush(); } ``` ### Using LangChain If you are using LangChain, please refer to our [LangChain tracing guide](/langsmith/trace-with-langchain#ensure-all-traces-are-submitted-before-exiting). If you prefer a video tutorial, check out the [Tracing Basics video](https://academy.langchain.com/pages/intro-to-langsmith-preview) from the Introduction to LangSmith Course. *** [Edit the source of this page on GitHub.](https://github.com/langchain-ai/docs/edit/main/src/langsmith/annotate-code.mdx) [Connect these docs programmatically](/use-these-docs) to Claude, VSCode, and more via MCP for real-time answers. # Annotate traces and runs inline Source: https://docs.langchain.com/langsmith/annotate-traces-inline LangSmith allows you to manually annotate traces with feedback within the application. This can be useful for adding context to a trace, such as a user's comment or a note about a specific issue. You can annotate a trace either inline or by sending the trace to an annotation queue, which allows you to closely inspect and log feedbacks to runs one at a time. Feedback tags are associated with your [workspace](/langsmith/administration-overview#workspaces). 
**You can attach user feedback to ANY intermediate run (span) of the trace, not just the root span.** This is useful for critiquing specific parts of the LLM application, such as the retrieval step or generation step of the RAG pipeline.

To annotate a trace inline, click `Annotate` in the upper right corner of the trace view for any particular run that is part of the trace. This will open a pane that allows you to choose from the feedback tags associated with your workspace and add a score for particular tags. You can also add a standalone comment. Follow [this guide](./set-up-feedback-criteria) to set up feedback tags for your workspace. You can also set up new feedback criteria from within the pane itself. You can use the labeled keyboard shortcuts to streamline the annotation process.

*** [Edit the source of this page on GitHub.](https://github.com/langchain-ai/docs/edit/main/src/langsmith/annotate-traces-inline.mdx) [Connect these docs programmatically](/use-these-docs) to Claude, VSCode, and more via MCP for real-time answers.

# Use annotation queues

Source: https://docs.langchain.com/langsmith/annotation-queues

*Annotation queues* provide a streamlined, directed view for human annotators to attach feedback to specific [runs](/langsmith/observability-concepts#runs). While you can always annotate [traces](/langsmith/observability-concepts#traces) inline, annotation queues provide another option to group runs together, then have annotators review and provide [feedback](/langsmith/observability-concepts#feedback) on them.

## Create an annotation queue

To create an annotation queue:

1. Navigate to the **Annotation queues** section on the left-hand navigation panel of the [LangSmith UI](https://smith.langchain.com).
2. Click **+ New annotation queue** in the top right corner.

Create Annotation Queue form with Basic Details, Annotation Rubric, and Feedback sections.

### Basic Details

1. Fill in the form with the **Name** and **Description** of the queue. You can also assign a **default dataset** to the queue, which will streamline the process of sending the inputs and outputs of certain runs to datasets in your LangSmith [workspace](/langsmith/administration-overview#workspaces).
* **Enable reservations on runs**: When a reviewer views a run, the run is reserved for that reviewer for the specified **Reservation length**. If there are multiple reviewers per run as specified above, the run can be reserved by multiple reviewers (up to the number of reviewers per run) at the same time. We recommend enabling reservations. This will prevent multiple annotators from reviewing the same run at the same time. If a reviewer has viewed a run and then leaves the run without marking it **Done**, the reservation will expire after the specified **Reservation length**. The run is then released back into the queue and can be reserved by another reviewer. Clicking **Requeue** for a run's annotation will only move the current run to the end of the current user's queue; it won't affect the queue order of any other user. It will also release the reservation that the current user has on that run. As a result of the **Collaborator settings**, it's possible (and likely) that the number of runs visible to an individual in an annotation queue differs from the total number of runs in the queue compared to another user's queue size. You can update these settings at any time by clicking on the pencil icon in the **Annotation Queues** section. ## Assign runs to an annotation queue To assign runs to an annotation queue, do one of the following: * Click on **Add to Annotation Queue** in top right corner of any [trace](/langsmith/observability-concepts#traces) view. You can add any intermediate [run](/langsmith/observability-concepts#runs) (span) of the trace to an annotation queue, but not the root span. Trace view with the Add to Annotation Queue button highglighted at the top of the screen. * Select multiple runs in the runs table then click **Add to Annotation Queue** at the bottom of the page. View of the runs table with runs selected. Add to Annotation Queue button at the botton of the page. * [Set up an automation rule](/langsmith/rules) that automatically assigns runs that pass a certain filter and sampling condition to an annotation queue. * Navigate to the **Datasets & Experiments** page and select a dataset. On the dataset's page select one or multiple [experiments](/langsmith/evaluation-concepts#experiment). At the bottom of the page, click ** Annotate**. From the resulting popup, you can either create a new queue or add the runs to an existing one. Selected experiments with the Annotate button at the bottom of the page. It is often a good idea to assign runs that have a particular type of user feedback score (e.g., thumbs up, thumbs down) from the application to an annotation queue. This way, you can identify and address issues that are causing user dissatisfaction. To learn more about how to capture user feedback from your LLM application, follow the guide on [attaching user feedback](/langsmith/attach-user-feedback). ## Review runs in an annotation queue To review runs in an annotation queue: 1. Navigate to the **Annotation Queues** section through the left-hand navigation bar. 2. Click on the queue you want to review. This will take you to a focused, cyclical view of the runs in the queue that require review. 3. You can attach a comment, attach a score for a particular [feedback](/langsmith/observability-concepts#feedback) criteria, add the run to a dataset or mark the run as reviewed. You can also remove the run from the queue for all users, despite any current reservations or settings for the queue, by clicking the **Trash** icon next to **View run**. 
The keyboard shortcuts next to each option can help streamline the review process. View of a run with the Annotate side panel. Keyboard shortcuts visible for options.

## Video guide