You can find information about Anthropic’s latest models, their costs, context windows, and supported input types in the Claude docs.
API Reference: For detailed documentation of all features and configuration options, head to the ChatAnthropic API reference.
AWS Bedrock and Google Vertex AI: Note that certain Anthropic models can also be accessed via AWS Bedrock and Google Vertex AI. See the ChatBedrock and ChatVertexAI integrations to use Anthropic models via these services.

Overview

Integration details

Class: ChatAnthropic | Package: langchain-anthropic | Serializable: beta | JS/TS support: yes (npm)

Model features

Tool calling | Structured output | JSON mode | Image input | Audio input | Video input | Token-level streaming | Native async | Token usage | Logprobs

Setup

To access Anthropic (Claude) models, you'll need to install the langchain-anthropic integration package and acquire a Claude API key.

Installation

pip install -U langchain-anthropic

Credentials

Head to the Claude console to sign up and generate a Claude API key. Once you've done this, set the ANTHROPIC_API_KEY environment variable:
import getpass
import os

if "ANTHROPIC_API_KEY" not in os.environ:
    os.environ["ANTHROPIC_API_KEY"] = getpass.getpass("Enter your Anthropic API key: ")
To enable automated tracing of your model calls, set your LangSmith API key:
os.environ["LANGSMITH_API_KEY"] = getpass.getpass("Enter your LangSmith API key: ")
os.environ["LANGSMITH_TRACING"] = "true"

Instantiation

Now we can instantiate our model object and generate chat completions:
from langchain_anthropic import ChatAnthropic

model = ChatAnthropic(
    model="claude-haiku-4-5-20251001",
    temperature=0,
    max_tokens=1024,
    timeout=None,
    max_retries=2,
    # other params...
)
See the ChatAnthropic API reference for details on all available parameters.

Invocation

messages = [
    (
        "system",
        "You are a helpful assistant that translates English to French. Translate the user sentence.",
    ),
    ("human", "I love programming."),
]
ai_msg = model.invoke(messages)
ai_msg
AIMessage(content="J'adore la programmation.", response_metadata={'id': 'msg_018Nnu76krRPq8HvgKLW4F8T', 'model': 'claude-3-5-sonnet-20240620', 'stop_reason': 'end_turn', 'stop_sequence': None, 'usage': {'input_tokens': 29, 'output_tokens': 11}}, id='run-57e9295f-db8a-48dc-9619-babd2bedd891-0', usage_metadata={'input_tokens': 29, 'output_tokens': 11, 'total_tokens': 40})
print(ai_msg.text)
J'adore la programmation.
Learn more about supported invocation methods in our models guide.
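ChatAnthropic also implements the standard LangChain invocation methods, such as streaming and async calls. A minimal sketch, reusing the model and messages defined above:
for chunk in model.stream(messages):
    print(chunk.text, end="")

# Inside an async function you can use the async variants, e.g.:
# ai_msg = await model.ainvoke(messages)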

Token counting

You can count tokens in messages before sending them to the model using get_num_tokens_from_messages(). This uses Anthropic’s official token counting API.
from langchain_anthropic import ChatAnthropic
from langchain.messages import HumanMessage, SystemMessage

model = ChatAnthropic(model="claude-sonnet-4-5-20250929")

messages = [
    SystemMessage(content="You are a scientist"),
    HumanMessage(content="Hello, Claude"),
]

token_count = model.get_num_tokens_from_messages(messages)
print(token_count)
14
You can also count tokens when using tools:
from langchain.tools import tool

@tool(parse_docstring=True)
def get_weather(location: str) -> str:
    """Get the current weather in a given location

    Args:
        location: The city and state, e.g. San Francisco, CA
    """
    return "Sunny"

messages = [
    HumanMessage(content="What's the weather like in San Francisco?"),
]

token_count = model.get_num_tokens_from_messages(messages, tools=[get_weather])
print(token_count)
586

Content blocks

When using tools, extended thinking, and other features, content from a single Anthropic AIMessage can either be a single string or a list of Anthropic content blocks. For example, when an Anthropic model invokes a tool, the tool invocation is part of the message content (as well as being exposed in the standardized AIMessage.tool_calls):
from langchain_anthropic import ChatAnthropic
from typing_extensions import Annotated

model = ChatAnthropic(model="claude-haiku-4-5-20251001")


def get_weather(
    location: Annotated[str, ..., "Location as city and state."]
) -> str:
    """Get the weather at a location."""
    return "It's sunny."


model_with_tools = model.bind_tools([get_weather])
response = model_with_tools.invoke("Which city is hotter today: LA or NY?")
response.content
[{'text': "I'll help you compare the temperatures of Los Angeles and New York by checking their current weather. I'll retrieve the weather for both cities.",
  'type': 'text'},
 {'id': 'toolu_01CkMaXrgmsNjTso7so94RJq',
  'input': {'location': 'Los Angeles, CA'},
  'name': 'get_weather',
  'type': 'tool_use'},
 {'id': 'toolu_01SKaTBk9wHjsBTw5mrPVSQf',
  'input': {'location': 'New York, NY'},
  'name': 'get_weather',
  'type': 'tool_use'}]
Using content_blocks will render the content in LangChain's standard format, which is consistent across model providers. Read more about content blocks.
response.content_blocks
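For example, you can branch on the standardized block types. A minimal sketch based on the block shapes shown throughout this guide:
for block in response.content_blocks:
    if block["type"] == "text":
        print("Text:", block["text"])
    elif block["type"] == "tool_call":
        print("Tool call:", block["name"], block["args"])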
You can also access tool calls specifically in a standard format using the tool_calls attribute:
response.tool_calls
[{'name': 'GetWeather',
  'args': {'location': 'Los Angeles, CA'},
  'id': 'toolu_01Ddzj5PkuZkrjF4tafzu54A'},
 {'name': 'GetWeather',
  'args': {'location': 'New York, NY'},
  'id': 'toolu_012kz4qHZQqD4qg8sFPeKqpP'}]

Tools

Anthropic’s tool use features allow you to define external functions that Claude can call during a conversation. This enables dynamic information retrieval, computations, and interactions with external systems. See ChatAnthropic.bind_tools for details on how to bind tools to your model instance.
For information about Claude's built-in tools (code execution, web browsing, files API, etc.), see the Built-in tools section below.
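A typical tool-calling loop runs the requested tool yourself and passes its output back to the model as a ToolMessage. A minimal sketch, reusing the get_weather function and model_with_tools from the Content blocks example above:
from langchain.messages import HumanMessage, ToolMessage

messages = [HumanMessage("What's the weather in San Francisco?")]
response = model_with_tools.invoke(messages)
messages.append(response)

for tool_call in response.tool_calls:
    result = get_weather(**tool_call["args"])  # run the tool ourselves
    messages.append(ToolMessage(content=result, tool_call_id=tool_call["id"]))

final_response = model_with_tools.invoke(messages)
print(final_response.text)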

Strict tool use

Strict tool use requires:
  • Claude Sonnet 4.5 or Opus 4.1.
  • langchain-anthropic>=1.1.0
Anthropic supports opt-in strict schema adherence for tool calls. This guarantees that tool names and arguments are validated and correctly typed through constrained decoding. Without strict mode, Claude can occasionally generate invalid tool inputs that break your applications:
  • Type mismatches: passengers: "2" instead of passengers: 2
  • Missing required fields: Omitting fields your function expects
  • Invalid enum values: Values outside the allowed set
  • Schema violations: Nested objects not matching expected structure
Strict tool use guarantees schema-compliant tool calls:
  • Tool inputs strictly follow your input_schema
  • Guaranteed field types and required fields
  • Eliminate error handling for malformed inputs
  • Tool name used is always from provided tools
Use strict tool use when:
  • Building agentic workflows where reliability is critical
  • Tools have many parameters or nested objects
  • Functions require specific types (e.g., int vs str)
Use standard tool calling when:
  • Making simple, single-turn tool calls
  • Prototyping and experimenting
To enable strict tool use, specify strict=True when calling bind_tools.
from langchain_anthropic import ChatAnthropic

model = ChatAnthropic(model="claude-sonnet-4-5-20250929")

def get_weather(location: str) -> str:
    """Get the weather at a location."""
    return "It's sunny."

model_with_tools = model.bind_tools([get_weather], strict=True)  
Consider a booking system where passengers must be an integer:
from langchain_anthropic import ChatAnthropic
from typing import Literal

model = ChatAnthropic(model="claude-sonnet-4-5-20250929")

def book_flight(
    destination: str,
    departure_date: str,
    passengers: int, 
    cabin_class: Literal["economy", "business", "first"]
) -> str:
    """Book a flight to a destination.

    Args:
        destination: The destination city
        departure_date: Date in YYYY-MM-DD format
        passengers: Number of passengers (must be an integer)
        cabin_class: The cabin class for the flight
    """
    return f"Booked {passengers} passengers to {destination}"

model_with_tools = model.bind_tools(
    [book_flight],
    strict=True, 
    tool_choice="any",
)
response = model_with_tools.invoke("Book 2 passengers to Tokyo, business class, 2025-01-15")

# With strict=True, passengers is guaranteed to be int, not "2" or "two"
print(response.tool_calls[0]["args"]["passengers"])
2
Strict tool use has some JSON schema limitations to be aware of. See the Claude docs for more details. If your tool schema uses unsupported features, you’ll receive a 400 error. In these cases, simplify the schema or use standard (non-strict) tool calling.
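If you want to handle the unsupported-schema case programmatically, one option is to catch the error and retry without strict mode. A minimal sketch, assuming the 400 surfaces as the anthropic SDK's BadRequestError:
from anthropic import BadRequestError

try:
    model_with_tools = model.bind_tools([book_flight], strict=True)
    response = model_with_tools.invoke("Book 2 passengers to Tokyo")
except BadRequestError:
    # Schema uses a feature strict mode doesn't support; fall back to standard tool calling
    model_with_tools = model.bind_tools([book_flight])
    response = model_with_tools.invoke("Book 2 passengers to Tokyo")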

Input examples

For complex tools, you can provide usage examples to help Claude understand how to use them correctly. This is done by setting input_examples in the tool’s extras parameter.
from langchain_anthropic import ChatAnthropic
from langchain.tools import tool

@tool(
    extras={ 
        "input_examples": [ 
            { 
                "query": "weather report", 
                "location": "San Francisco", 
                "format": "detailed"
            }, 
            { 
                "query": "temperature", 
                "location": "New York", 
                "format": "brief"
            } 
        ] 
    } 
)
def search_weather_data(query: str, location: str, format: str = "brief") -> str:
    """Search weather database with specific query and format preferences.

    Args:
        query: The type of weather information to retrieve
        location: City or region to search
        format: Output format, either 'brief' or 'detailed'
    """
    return f"{format.title()} {query} for {location}: Data found"

model = ChatAnthropic(model="claude-sonnet-4-5-20250929")
model_with_tools = model.bind_tools([search_weather_data])

response = model_with_tools.invoke(
    "Get me a detailed weather report for Seattle"
)
The extras parameter also supports cache_control (see Caching tools below) and defer_loading (see the tool search feature under Built-in tools).

Token-efficient tool use

Anthropic supports a token-efficient tool use feature. It is supported by default on all Claude 4 models and above.
To use token-efficient tool use with Claude Sonnet 3.7, specify the token-efficient-tools-2025-02-19 beta header when instantiating the model:
from langchain_anthropic import ChatAnthropic
from langchain.tools import tool

model = ChatAnthropic(
    model="claude-3-7-sonnet-20250219",
    betas=["token-efficient-tools-2025-02-19"],  
)


@tool
def get_weather(location: str) -> str:
    """Get the weather at a location."""
    return "It's sunny."


model_with_tools = model.bind_tools([get_weather])
response = model_with_tools.invoke("What's the weather in San Francisco?")
print(response.tool_calls)
print(f"\nTotal tokens: {response.usage_metadata['total_tokens']}")
[{'name': 'get_weather', 'args': {'location': 'San Francisco'}, 'id': 'toolu_01EoeE1qYaePcmNbUvMsWtmA', 'type': 'tool_call'}]

Total tokens: 408
Anthropic automatically caches tool descriptions to reduce token usage on subsequent calls. See Caching tools for details.

Fine-grained tool streaming

Anthropic supports fine-grained tool streaming, a beta feature that reduces latency when streaming tool calls with large parameters. Rather than buffering entire parameter values before transmission, fine-grained streaming sends parameter data as it becomes available. This can reduce the initial delay from 15 seconds to around 3 seconds for large tool parameters.
Fine-grained streaming may return invalid or partial JSON inputs, especially if the response reaches max_tokens before completing. Implement appropriate error handling for incomplete JSON data.
To enable fine-grained tool streaming, specify the fine-grained-tool-streaming-2025-05-14 beta header when initializing the model:
from langchain_anthropic import ChatAnthropic

model = ChatAnthropic(
    model="claude-sonnet-4-5-20250929",
    betas=["fine-grained-tool-streaming-2025-05-14"],  
)

def write_document(title: str, content: str) -> str:
    """Write a document with the given title and content."""
    return f"Document '{title}' written successfully"

model_with_tools = model.bind_tools([write_document])

# Stream tool calls with reduced latency
for chunk in model_with_tools.stream(
    "Write a detailed technical document about the benefits of streaming APIs"
):
    print(chunk.content)
The streaming data arrives as input_json_delta blocks in chunk.content. You can accumulate these to build the complete tool arguments:
import json

accumulated_json = ""

for chunk in model_with_tools.stream("Write a document about AI"):
    for block in chunk.content:
        if isinstance(block, dict) and block.get("type") == "input_json_delta":
            accumulated_json += block.get("partial_json", "")
            try:
                # Try to parse accumulated JSON
                parsed = json.loads(accumulated_json)
                print(f"Complete args: {parsed}")
            except json.JSONDecodeError:
                # JSON is still incomplete, continue accumulating
                pass
Complete args: {'title': 'Artificial Intelligence: An Overview', 'content': '# Artificial Intelligence: An Overview...

Multimodal

Claude supports image and PDF inputs as content blocks, both in Anthropic’s native format (see docs for vision and PDF support) as well as LangChain’s standard format.

Supported input methods

Both images and PDFs can be provided as base64 inline data, via HTTP/HTTPS URLs, or through the Files API.
The Files API can also be used to upload files to a container for use with Claude’s built-in code-execution tools. See the code execution section for details.

Image input

Provide image inputs along with text using a HumanMessage with list content format.
from langchain_anthropic import ChatAnthropic
from langchain.messages import HumanMessage

model = ChatAnthropic(model="claude-sonnet-4-5-20250929")

message = HumanMessage(
    content=[
        {"type": "text", "text": "Describe the image at the URL."},
        {
            "type": "image",
            "url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg",
        },
    ]
)
response = model.invoke([message])
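Base64-encoded image data can be passed in the same list-content format. A minimal sketch, assuming the standard base64 image block shape (base64 plus mime_type):
import base64

import requests

image_url = "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"
image_b64 = base64.b64encode(requests.get(image_url).content).decode("utf-8")

message = HumanMessage(
    content=[
        {"type": "text", "text": "Describe the image."},
        {
            "type": "image",
            "base64": image_b64,
            "mime_type": "image/jpeg",
        },
    ]
)
response = model.invoke([message])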

PDF input

Provide PDF file inputs along with text.
from langchain_anthropic import ChatAnthropic
from langchain.messages import HumanMessage

model = ChatAnthropic(model="claude-sonnet-4-5-20250929")

message = HumanMessage(
    content=[
        {"type": "text", "text": "Summarize this document."},
        {
            "type": "file",
            "url": "https://www.w3.org/WAI/ER/tests/xhtml/testfiles/resources/pdf/dummy.pdf",
            "mime_type": "application/pdf",
        },
    ]
)
response = model.invoke([message])

Extended thinking

Some Claude models support an extended thinking feature, which will output the step-by-step reasoning process that led to its final answer. See compatible models in the Claude documentation. To use extended thinking, specify the thinking parameter when initializing ChatAnthropic. If needed, it can also be passed in as a parameter during invocation. You will need to specify a token budget to use this feature. See usage example below:
import json
from langchain_anthropic import ChatAnthropic

model = ChatAnthropic(
    model="claude-sonnet-4-5-20250929",
    max_tokens=5000,
    thinking={"type": "enabled", "budget_tokens": 2000}, 
)

response = model.invoke("What is the cube root of 50.653?")
print(json.dumps(response.content_blocks, indent=2))
[
  {
    "type": "reasoning",
    "reasoning": "To find the cube root of 50.653, I need to find the value of $x$ such that $x^3 = 50.653$.\n\nI can try to estimate this first. \n$3^3 = 27$\n$4^3 = 64$\n\nSo the cube root of 50.653 will be somewhere between 3 and 4, but closer to 4.\n\nLet me try to compute this more precisely. I can use the cube root function:\n\ncube root of 50.653 = 50.653^(1/3)\n\nLet me calculate this:\n50.653^(1/3) \u2248 3.6998\n\nLet me verify:\n3.6998^3 \u2248 50.6533\n\nThat's very close to 50.653, so I'm confident that the cube root of 50.653 is approximately 3.6998.\n\nActually, let me compute this more precisely:\n50.653^(1/3) \u2248 3.69981\n\nLet me verify once more:\n3.69981^3 \u2248 50.652998\n\nThat's extremely close to 50.653, so I'll say that the cube root of 50.653 is approximately 3.69981.",
    "extras": {"signature": "ErUBCkYIBxgCIkB0UjV..."}
  },
  {
    "type": "text",
    "text": "The cube root of 50.653 is approximately 3.6998.\n\nTo verify: 3.6998\u00b3 = 50.6530, which is very close to our original number."
  }
]

Effort

Certain Claude models support an effort feature, which controls how many tokens Claude uses when responding. This is useful for balancing response quality against latency and cost.
from langchain_anthropic import ChatAnthropic

model = ChatAnthropic(
    model="claude-opus-4-5-20251101",
    effort="medium", 
)

response = model.invoke("Analyze the trade-offs between microservices and monolithic architectures")
Setting effort to "high" produces exactly the same behavior as omitting the parameter altogether.
See the Claude documentation for detail on when to use different effort levels and to see supported models.

Prompt caching

Anthropic supports caching of elements of your prompts, including messages, tool definitions, tool results, images and documents. This allows you to re-use large documents, instructions, few-shot documents, and other data to reduce latency and costs. To enable caching on an element of a prompt, mark its associated content block using the cache_control key. See examples below:
Only certain Claude models support prompt caching. See the Claude documentation for details.

Messages

import requests
from langchain_anthropic import ChatAnthropic


model = ChatAnthropic(model="claude-sonnet-4-5-20250929")

# Pull LangChain readme
get_response = requests.get(
    "https://raw.githubusercontent.com/langchain-ai/langchain/master/README.md"
)
readme = get_response.text

messages = [
    {
        "role": "system",
        "content": [
            {
                "type": "text",
                "text": "You are a technology expert.",
            },
            {
                "type": "text",
                "text": f"{readme}",
                "cache_control": {"type": "ephemeral"},  
            },
        ],
    },
    {
        "role": "user",
        "content": "What's LangChain, according to its README?",
    },
]

response_1 = model.invoke(messages)
response_2 = model.invoke(messages)

usage_1 = response_1.usage_metadata["input_token_details"]
usage_2 = response_2.usage_metadata["input_token_details"]

print(f"First invocation:\n{usage_1}")
print(f"\nSecond:\n{usage_2}")
First invocation:
{'cache_read': 0, 'cache_creation': 1458}

Second:
{'cache_read': 1458, 'cache_creation': 0}
Extended caching: The cache lifetime is 5 minutes by default. If this is too short, you can apply one-hour caching by enabling the "extended-cache-ttl-2025-04-11" beta header and specifying "cache_control": {"type": "ephemeral", "ttl": "1h"} on the message:
model = ChatAnthropic(
    model="claude-sonnet-4-5-20250929",
    betas=["extended-cache-ttl-2025-04-11"],  
)

messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": f"{long_text}",
                "cache_control": {"type": "ephemeral", "ttl": "1h"}, 
            },
        ],
    }
]
Details of cached token counts will be included in the InputTokenDetails of the response's usage_metadata:
response = model.invoke(messages)
response.usage_metadata
{
    "input_tokens": 1500,
    "output_tokens": 200,
    "total_tokens": 1700,
    "input_token_details": {
        "cache_read": 0,
        "cache_creation": 1000,
        "ephemeral_1h_input_tokens": 750,
        "ephemeral_5m_input_tokens": 250,
    }
}

Caching tools

from langchain_anthropic import ChatAnthropic
from langchain.tools import tool


# For demonstration purposes, we artificially expand the
# tool description.
description = (
    "Get the weather at a location. "
    f"By the way, check out this readme: {readme}"
)


@tool(description=description, extras={"cache_control": {"type": "ephemeral"}})  
def get_weather(location: str) -> str:
    return "It's sunny."


model = ChatAnthropic(model="claude-sonnet-4-5-20250929")
model_with_tools = model.bind_tools([get_weather])
query = "What's the weather in San Francisco?"

response_1 = model_with_tools.invoke(query)
response_2 = model_with_tools.invoke(query)

usage_1 = response_1.usage_metadata["input_token_details"]
usage_2 = response_2.usage_metadata["input_token_details"]

print(f"First invocation:\n{usage_1}")
print(f"\nSecond:\n{usage_2}")
First invocation:
{'cache_read': 0, 'cache_creation': 1809}

Second:
{'cache_read': 1809, 'cache_creation': 0}

Incremental caching in conversational applications

Prompt caching can be used in multi-turn conversations to maintain context from earlier messages without redundant processing. We can enable incremental caching by marking the final message with cache_control. Claude will automatically use the longest previously-cached prefix for follow-up messages. Below, we implement a simple chatbot that incorporates this feature. We follow the LangChain chatbot tutorial, but add a custom reducer that automatically marks the last content block in each user message with cache_control:
import requests
from langchain_anthropic import ChatAnthropic
from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import START, StateGraph, add_messages
from typing_extensions import Annotated, TypedDict


model = ChatAnthropic(model="claude-sonnet-4-5-20250929")

# Pull LangChain readme
get_response = requests.get(
    "https://raw.githubusercontent.com/langchain-ai/langchain/master/README.md"
)
readme = get_response.text


def messages_reducer(left: list, right: list) -> list:
    # Update last user message
    for i in range(len(right) - 1, -1, -1):
        if right[i].type == "human":
            right[i].content[-1]["cache_control"] = {"type": "ephemeral"}
            break

    return add_messages(left, right)


class State(TypedDict):
    messages: Annotated[list, messages_reducer]


workflow = StateGraph(state_schema=State)


# Define the function that calls the model
def call_model(state: State):
    response = model.invoke(state["messages"])
    return {"messages": [response]}


# Define the (single) node in the graph
workflow.add_edge(START, "model")
workflow.add_node("model", call_model)

# Add memory
memory = MemorySaver()
app = workflow.compile(checkpointer=memory)
from langchain.messages import HumanMessage

config = {"configurable": {"thread_id": "abc123"}}

query = "Hi! I'm Bob."

input_message = HumanMessage([{"type": "text", "text": query}])
output = app.invoke({"messages": [input_message]}, config)
output["messages"][-1].pretty_print()
print(f"\n{output['messages'][-1].usage_metadata['input_token_details']}")
================================== Ai Message ==================================

Hello, Bob! It's nice to meet you. How are you doing today? Is there something I can help you with?

{'cache_read': 0, 'cache_creation': 0, 'ephemeral_5m_input_tokens': 0, 'ephemeral_1h_input_tokens': 0}
query = f"Check out this readme: {readme}"

input_message = HumanMessage([{"type": "text", "text": query}])
output = app.invoke({"messages": [input_message]}, config)
output["messages"][-1].pretty_print()
print(f"\n{output['messages'][-1].usage_metadata['input_token_details']}")
================================== Ai Message ==================================

I can see you've shared the README from the LangChain GitHub repository. This is the documentation for LangChain, which is a popular framework for building applications powered by Large Language Models (LLMs). Here's a summary of what the README contains:

LangChain is:
- A framework for developing LLM-powered applications
- Helps chain together components and integrations to simplify AI application development
- Provides a standard interface for models, embeddings, vector stores, etc.

Key features/benefits:
- Real-time data augmentation (connect LLMs to diverse data sources)
- Model interoperability (swap models easily as needed)
- Large ecosystem of integrations

The LangChain ecosystem includes:
- LangSmith - For evaluations and observability
- LangGraph - For building complex agents with customizable architecture
- LangSmith - For deployment and scaling of agents

The README also mentions installation instructions (`pip install -U langchain`) and links to various resources including tutorials, how-to guides, conceptual guides, and API references.

Is there anything specific about LangChain you'd like to know more about, Bob?

{'cache_read': 0, 'cache_creation': 1846, 'ephemeral_5m_input_tokens': 1846, 'ephemeral_1h_input_tokens': 0}
query = "What was my name again?"

input_message = HumanMessage([{"type": "text", "text": query}])
output = app.invoke({"messages": [input_message]}, config)
output["messages"][-1].pretty_print()
print(f"\n{output['messages'][-1].usage_metadata['input_token_details']}")
================================== Ai Message ==================================

Your name is Bob. You introduced yourself at the beginning of our conversation.

{'cache_read': 1846, 'cache_creation': 278, 'ephemeral_5m_input_tokens': 278, 'ephemeral_1h_input_tokens': 0}
In the LangSmith trace, toggling “raw output” will show exactly what messages are sent to the chat model, including cache_control keys.

Citations

Anthropic supports a citations feature that lets Claude attach context to its answers based on source documents supplied by the user. When document or search_result content blocks with "citations": {"enabled": True} are included in a query, Claude may generate citations in its response.

Simple example

In this example we pass a plain text document. In the background, Claude automatically chunks the input text into sentences, which are used when generating citations.
from langchain_anthropic import ChatAnthropic

model = ChatAnthropic(model="claude-haiku-4-5-20251001")

messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "document",
                "source": {
                    "type": "text",
                    "media_type": "text/plain",
                    "data": "The grass is green. The sky is blue.",
                },
                "title": "My Document",
                "context": "This is a trustworthy document.",
                "citations": {"enabled": True},
            },
            {"type": "text", "text": "What color is the grass and sky?"},
        ],
    }
]
response = model.invoke(messages)
response.content
[{'text': 'Based on the document, ', 'type': 'text'},
 {'text': 'the grass is green',
  'type': 'text',
  'citations': [{'type': 'char_location',
    'cited_text': 'The grass is green. ',
    'document_index': 0,
    'document_title': 'My Document',
    'start_char_index': 0,
    'end_char_index': 20}]},
 {'text': ', and ', 'type': 'text'},
 {'text': 'the sky is blue',
  'type': 'text',
  'citations': [{'type': 'char_location',
    'cited_text': 'The sky is blue.',
    'document_index': 0,
    'document_title': 'My Document',
    'start_char_index': 20,
    'end_char_index': 36}]},
 {'text': '.', 'type': 'text'}]
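To render a cited answer, you can stitch the text blocks back together and collect cited passages as footnotes. A minimal sketch based on the native content shape shown above:
footnotes = []
answer = ""
for block in response.content:
    if block.get("type") == "text":
        answer += block["text"]
        for citation in block.get("citations", []):
            footnotes.append(citation["cited_text"])
            answer += f"[{len(footnotes)}]"

print(answer)
for i, cited in enumerate(footnotes, start=1):
    print(f"[{i}] {cited}")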

In tool results (agentic RAG)

Claude supports a search_result content block representing citable results from queries against a knowledge base or other custom source. These content blocks can be passed to Claude both top-line (as in the above example) and within a tool result. This allows Claude to cite elements of its response using the result of a tool call. To pass search results in response to tool calls, define a tool that returns a list of search_result content blocks in Anthropic's native format. For example:
def retrieval_tool(query: str) -> list[dict]:
    """Access my knowledge base."""

    # Run a search (e.g., with a LangChain vector store)
    results = vector_store.similarity_search(query=query, k=2)

    # Package results into search_result blocks
    return [
        {
            "type": "search_result",
            # Customize fields as desired, using document metadata or otherwise
            "title": "My Document Title",
            "source": "Source description or provenance",
            "citations": {"enabled": True},
            "content": [{"type": "text", "text": doc.page_content}],
        }
        for doc in results
    ]
Here we demonstrate an end-to-end example in which we populate a LangChain vector store with sample documents and equip Claude with a tool that queries those documents. The tool here takes a search query and a category string literal, but any valid tool signature can be used. This example requires langchain-openai and numpy to be installed:
pip install langchain-openai numpy
from typing import Literal

from langchain.chat_models import init_chat_model
from langchain.embeddings import init_embeddings
from langchain_core.documents import Document
from langchain_core.vectorstores import InMemoryVectorStore
from langgraph.checkpoint.memory import InMemorySaver
from langchain.agents import create_agent


# Set up vector store
# Ensure you set your OPENAI_API_KEY environment variable
embeddings = init_embeddings("openai:text-embedding-3-small")
vector_store = InMemoryVectorStore(embeddings)

document_1 = Document(
    id="1",
    page_content=(
        "To request vacation days, submit a leave request form through the "
        "HR portal. Approval will be sent by email."
    ),
    metadata={
        "category": "HR Policy",
        "doc_title": "Leave Policy",
        "provenance": "Leave Policy - page 1",
    },
)
document_2 = Document(
    id="2",
    page_content="Managers will review vacation requests within 3 business days.",
    metadata={
        "category": "HR Policy",
        "doc_title": "Leave Policy",
        "provenance": "Leave Policy - page 2",
    },
)
document_3 = Document(
    id="3",
    page_content=(
        "Employees with over 6 months tenure are eligible for 20 paid vacation days "
        "per year."
    ),
    metadata={
        "category": "Benefits Policy",
        "doc_title": "Benefits Guide 2025",
        "provenance": "Benefits Policy - page 1",
    },
)

documents = [document_1, document_2, document_3]
vector_store.add_documents(documents=documents)


# Define tool
async def retrieval_tool(
    query: str, category: Literal["HR Policy", "Benefits Policy"]
) -> list[dict]:
    """Access my knowledge base."""

    def _filter_function(doc: Document) -> bool:
        return doc.metadata.get("category") == category

    results = vector_store.similarity_search(
        query=query, k=2, filter=_filter_function
    )

    return [
        {
            "type": "search_result",
            "title": doc.metadata["doc_title"],
            "source": doc.metadata["provenance"],
            "citations": {"enabled": True},
            "content": [{"type": "text", "text": doc.page_content}],
        }
        for doc in results
    ]



# Create agent
model = init_chat_model("claude-haiku-4-5-20251001")

checkpointer = InMemorySaver()
agent = create_agent(model, [retrieval_tool], checkpointer=checkpointer)


# Invoke on a query
config = {"configurable": {"thread_id": "session_1"}}

input_message = {
    "role": "user",
    "content": "How do I request vacation days?",
}
async for step in agent.astream(
    {"messages": [input_message]},
    config,
    stream_mode="values",
):
    step["messages"][-1].pretty_print()

Using with text splitters

Anthropic also lets you specify your own splits using custom document types. LangChain text splitters can be used to generate meaningful splits for this purpose. See the example below, where we split the LangChain README.md (a markdown document) and pass it to Claude as context. This example requires langchain-text-splitters to be installed:
pip install langchain-text-splitters
import requests
from langchain_anthropic import ChatAnthropic
from langchain_text_splitters import MarkdownTextSplitter


def format_to_anthropic_documents(documents: list[str]):
    return {
        "type": "document",
        "source": {
            "type": "content",
            "content": [{"type": "text", "text": document} for document in documents],
        },
        "citations": {"enabled": True},
    }


# Pull readme
get_response = requests.get(
    "https://raw.githubusercontent.com/langchain-ai/langchain/master/README.md"
)
readme = get_response.text

# Split into chunks
splitter = MarkdownTextSplitter(
    chunk_overlap=0,
    chunk_size=50,
)
documents = splitter.split_text(readme)

# Construct message
message = {
    "role": "user",
    "content": [
        format_to_anthropic_documents(documents),
        {"type": "text", "text": "Give me a link to LangChain's tutorials."},
    ],
}

# Query model
model = ChatAnthropic(model="claude-haiku-4-5-20251001")
response = model.invoke([message])

Context management

Anthropic supports a context editing feature that will automatically manage the model’s context window (e.g., by clearing tool results). See the Claude documentation for more details and configuration options.
Context management is supported since langchain-anthropic>=0.3.21. You must specify the context-management-2025-06-27 beta header to apply context management to your model calls.
from langchain_anthropic import ChatAnthropic


model = ChatAnthropic(
    model="claude-sonnet-4-5-20250929",
    betas=["context-management-2025-06-27"], 
    context_management={"edits": [{"type": "clear_tool_uses_20250919"}]}, 
)
model_with_tools = model.bind_tools([{"type": "web_search_20250305", "name": "web_search"}])
response = model_with_tools.invoke("Search for recent developments in AI")

Extended context window

Claude Sonnet 4 and 4.5 support a 1-million token context window, available in beta for organizations in usage tier 4 and organizations with custom rate limits. To enable the extended context window, specify the context-1m-2025-08-07 beta header:
from langchain_anthropic import ChatAnthropic
from langchain.messages import HumanMessage

model = ChatAnthropic(
    model="claude-sonnet-4-5-20250929",
    betas=["context-1m-2025-08-07"],  
)

long_document = """
This is a very long document that would benefit from the extended 1M
context window...
[imagine this continues for hundreds of thousands of tokens]
"""

messages = [
    HumanMessage(f"""
Please analyze this document and provide a summary:

{long_document}

What are the key themes and main conclusions?
""")
]

response = model.invoke(messages)
See the Claude documentation for details.

Structured output

Structured output requires:
  • Claude Sonnet 4.5 or Opus 4.1.
  • langchain-anthropic>=1.1.0
Anthropic supports a native structured output feature, which guarantees that its responses adhere to a given schema. You can access this feature in individual model calls, or by specifying the response format of a LangChain agent. See below for examples.
Use the with_structured_output method to generate a structured model response. Specify method="json_schema" to enable Anthropic’s native structured output feature; otherwise the method defaults to using function calling.
from langchain_anthropic import ChatAnthropic
from pydantic import BaseModel, Field

model = ChatAnthropic(model="claude-sonnet-4-5-20250929")

class Movie(BaseModel):
    """A movie with details."""
    title: str = Field(..., description="The title of the movie")
    year: int = Field(..., description="The year the movie was released")
    director: str = Field(..., description="The director of the movie")
    rating: float = Field(..., description="The movie's rating out of 10")

model_with_structure = model.with_structured_output(Movie, method="json_schema")  
response = model_with_structure.invoke("Provide details about the movie Inception")
response
Movie(title='Inception', year=2010, director='Christopher Nolan', rating=8.8)
Specify response_format with ProviderStrategy to engage Anthropic’s structured output feature when generating its final response.
from langchain.agents import create_agent
from langchain.agents.structured_output import ProviderStrategy
from pydantic import BaseModel

class Weather(BaseModel):
    temperature: float
    condition: str

def weather_tool(location: str) -> str:
    """Get the weather at a location."""
    return "Sunny and 75 degrees F."

agent = create_agent(
    model="anthropic:claude-sonnet-4-5",
    tools=[weather_tool],
    response_format=ProviderStrategy(Weather),  
)

result = agent.invoke({
    "messages": [{"role": "user", "content": "What's the weather in SF?"}]
})

result["structured_response"]
Weather(temperature=75.0, condition='Sunny')

Built-in tools

Anthropic supports a variety of built-in tools, which can be bound to the model in the usual way. Claude will generate tool calls adhering to its internal schema for the tool.

Bash tool

Claude supports a bash tool that allows it to execute shell commands in a persistent bash session. This enables system operations, script execution, and command-line automation.
Important: You must provide the execution environment. LangChain handles the API integration (sending/receiving tool calls), but you are responsible for:
  • Setting up a sandboxed computing environment (Docker, VM, etc.)
  • Implementing command execution and output capture
  • Passing results back to Claude in an agent loop
See the Claude bash tool docs for implementation guidance.
Requirements:
  • Claude 4 models or Claude Sonnet 3.7
from langchain_anthropic import ChatAnthropic

model = ChatAnthropic(model="claude-sonnet-4-5-20250929")

bash_tool = {
    "type": "bash_20250124",
    "name": "bash",
}

model_with_bash = model.bind_tools([bash_tool])
response = model_with_bash.invoke(
    "List all Python files in the current directory"
)
response.tool_calls will contain the bash command Claude wants to execute. You must run this command in your environment and pass the result back.
[{'type': 'text',
  'text': "I'll list the Python files in the current directory for you."},
 {'type': 'tool_call',
  'name': 'bash',
  'args': {'command': 'ls -la *.py'},
  'id': 'toolu_01ABC123...'}]
The bash tool supports two parameters:
  • command (required): The bash command to execute
  • restart (optional): Set to true to restart the bash session
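One possible shape for the execution loop is sketched below: run the requested command, then return its output to Claude as a ToolMessage. This assumes a disposable, sandboxed environment; never run model-generated shell commands on a machine you care about.
import subprocess

from langchain.messages import HumanMessage, ToolMessage

messages = [HumanMessage("List all Python files in the current directory")]
response = model_with_bash.invoke(messages)
messages.append(response)

for tool_call in response.tool_calls:
    command = tool_call["args"]["command"]
    # Execute inside your sandbox; subprocess is used here purely for illustration
    result = subprocess.run(command, shell=True, capture_output=True, text=True)
    messages.append(
        ToolMessage(content=result.stdout + result.stderr, tool_call_id=tool_call["id"])
    )

final = model_with_bash.invoke(messages)
print(final.text)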

Code execution

Claude can use a code execution tool to execute code in a sandboxed environment.
Anthropic’s 2025-08-25 code execution tools are supported since langchain-anthropic>=1.0.3. The legacy 2025-05-22 tool is supported since langchain-anthropic>=0.3.14.
The code sandbox does not have internet access, thus you may only use packages that are pre-installed in the environment. See the Claude docs for more info.
from langchain_anthropic import ChatAnthropic

model = ChatAnthropic(
    model="claude-sonnet-4-5-20250929",
)

tool = {"type": "code_execution_20250825", "name": "code_execution"} 
model_with_tools = model.bind_tools([tool])

response = model_with_tools.invoke(
    "Calculate the mean and standard deviation of [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]"
)
Using the Files API, Claude can write code to access files for data analysis and other purposes. See example below:
import anthropic
from langchain_anthropic import ChatAnthropic


client = anthropic.Anthropic()
file = client.beta.files.upload(
    file=open("/path/to/sample_data.csv", "rb")
)
file_id = file.id


# Run inference
model = ChatAnthropic(
    model="claude-sonnet-4-5-20250929",
)

tool = {"type": "code_execution_20250825", "name": "code_execution"} 
model_with_tools = model.bind_tools([tool])

input_message = {
    "role": "user",
    "content": [
        {
            "type": "text",
            "text": "Please plot these data and tell me what you see.",
        },
        {
            "type": "container_upload",
            "file_id": file_id,
        },
    ]
}
response = model_with_tools.invoke([input_message])
Note that Claude may generate files as part of its code execution. You can access these files using the Files API:
# Take all file outputs for demonstration purposes
file_ids = []
for block in response.content:
    if block["type"] == "bash_code_execution_tool_result":
        file_ids.extend(
            content["file_id"]
            for content in block.get("content", {}).get("content", [])
            if "file_id" in content
        )

for i, file_id in enumerate(file_ids):
    file_content = client.beta.files.download(file_id)
    file_content.write_to_file(f"/path/to/file_{i}.png")
Available tool versions:
  • code_execution_20250522 (legacy)
  • code_execution_20250825 (recommended)

Computer use

Claude supports computer use capabilities, allowing it to interact with desktop environments through screenshots, mouse control, and keyboard input.
Important: You must provide the execution environment. LangChain handles the API integration (sending/receiving tool calls), but you are responsible for:
  • Setting up a sandboxed computing environment (Linux VM, Docker container, etc.)
  • Implementing a virtual display (e.g., Xvfb)
  • Executing Claude’s tool calls (screenshot, mouse clicks, keyboard input)
  • Passing results back to Claude in an agent loop
Anthropic provides a reference implementation to help you get started.
Requirements:
  • Claude Opus 4.5, Claude 4, or Claude Sonnet 3.7
from langchain_anthropic import ChatAnthropic

model = ChatAnthropic(model="claude-sonnet-4-5-20250929")

# LangChain handles the API call and tool binding
computer_tool = {
    "type": "computer_20250124",
    "name": "computer",
    "display_width_px": 1024,
    "display_height_px": 768,
    "display_number": 1,
}

model_with_computer = model.bind_tools([computer_tool])
response = model_with_computer.invoke(
    "Take a screenshot to see what's on the screen"
)
response.tool_calls will contain the computer action Claude wants to perform. You must execute this action in your environment and pass the result back.
[{'type': 'text',
  'text': "I'll take a screenshot to see what's currently on the screen."},
 {'type': 'tool_call',
  'name': 'computer',
  'args': {'action': 'screenshot'},
  'id': 'toolu_01RNsqAE7dDZujELtacNeYv9'}]
Available tool versions:
  • computer_20250124 (for Claude 4 and Claude Sonnet 3.7)
  • computer_20251124 (for Claude Opus 4.5)

Remote MCP

Claude can use an MCP connector tool for model-generated calls to remote MCP servers.
Remote MCP is supported since langchain-anthropic>=0.3.14.
from langchain_anthropic import ChatAnthropic

mcp_servers = [
    {
        "type": "url",
        "url": "https://docs.langchain.com/mcp",
        "name": "LangChain Docs",
        # "tool_configuration": {  # optional configuration
        #     "enabled": True,
        #     "allowed_tools": ["ask_question"],
        # },
        # "authorization_token": "PLACEHOLDER",  # optional authorization if needed
    }
]

model = ChatAnthropic(
    model="claude-sonnet-4-5-20250929",
    mcp_servers=mcp_servers, 
)

response = model.invoke(
    "What are LangChain content blocks?",
    tools=[{"type": "mcp_toolset", "mcp_server_name": "LangChain Docs"}], 
)
response.content_blocks

Text editor

The text editor tool can be used to view and modify text files. See docs here for details.
from langchain_anthropic import ChatAnthropic

model = ChatAnthropic(model="claude-sonnet-4-5-20250929")

tool = {"type": "text_editor_20250728", "name": "str_replace_based_edit_tool"}
model_with_tools = model.bind_tools([tool])

response = model_with_tools.invoke(
    "There's a syntax error in my primes.py file. Can you help me fix it?"
)
print(response.text)
response.tool_calls
I'll help you fix the syntax error in your primes.py file. Let me first take a look at the file to identify the issue.
[{'name': 'str_replace_based_edit_tool',
  'args': {'command': 'view', 'path': '/root'},
  'id': 'toolu_011BG5RbqnfBYkD8qQonS9k9',
  'type': 'tool_call'}]
Available tool versions:
  • text_editor_20250124 (legacy)
  • text_editor_20250728 (recommended)

Web fetching

Claude can use a web fetching tool to retrieve full content from specified web pages and PDF documents and ground its responses with citations.
from langchain_anthropic import ChatAnthropic

model = ChatAnthropic(model="claude-haiku-4-5-20251001")

tool = {"type": "web_fetch_20250910", "name": "web_fetch", "max_uses": 3} 
model_with_tools = model.bind_tools([tool])

response = model_with_tools.invoke(
    "Please analyze the content at https://docs.langchain.com/"
)

Web search

Claude can use a web search tool to run searches and ground its responses with citations.
Web search tool is supported since langchain-anthropic>=0.3.13.
from langchain_anthropic import ChatAnthropic

model = ChatAnthropic(model="claude-sonnet-4-5-20250929")

tool = {"type": "web_search_20250305", "name": "web_search", "max_uses": 3} 
model_with_tools = model.bind_tools([tool])

response = model_with_tools.invoke("How do I update a web app to TypeScript 5.5?")

Memory tool

Claude supports a memory tool for client-side storage and retrieval of context across conversational threads. See docs here for details.
Anthropic’s built-in memory tool is supported since langchain-anthropic>=0.3.21
from langchain_anthropic import ChatAnthropic

model = ChatAnthropic(
    model="claude-sonnet-4-5-20250929",
)
model_with_tools = model.bind_tools([{"type": "memory_20250818", "name": "memory"}]) 

response = model_with_tools.invoke("What are my interests?")
response.content_blocks
[{'type': 'text',
  'text': "I'll check my memory to see what information I have about your interests."},
 {'type': 'tool_call',
  'name': 'memory',
  'args': {'command': 'view', 'path': '/memories'},
  'id': 'toolu_01XeP9sxx44rcZHFNqXSaKqh'}]
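As with the other client-side tools, you are responsible for executing memory commands and passing results back. A minimal sketch that handles only the view command against a hypothetical local ./memories directory (see the Claude docs for the full command set):
import os

from langchain.messages import HumanMessage, ToolMessage

MEMORY_DIR = "./memories"  # hypothetical local backing store
os.makedirs(MEMORY_DIR, exist_ok=True)

messages = [HumanMessage("What are my interests?")]
response = model_with_tools.invoke(messages)
messages.append(response)

for tool_call in response.tool_calls:
    if tool_call["name"] == "memory" and tool_call["args"].get("command") == "view":
        listing = "\n".join(sorted(os.listdir(MEMORY_DIR))) or "(directory is empty)"
        messages.append(ToolMessage(content=listing, tool_call_id=tool_call["id"]))

followup = model_with_tools.invoke(messages)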

Tool search

Claude supports a tool search feature that enables dynamic tool discovery and loading. Instead of loading all tool definitions into the context window upfront, Claude can search your tool catalog and load only the tools it needs. This is useful when:
  • You have 10+ tools available in your system
  • Tool definitions are consuming significant tokens
  • You’re experiencing tool selection accuracy issues with large tool sets
There are two tool search variants:
  • Regex (tool_search_tool_regex_20251119): Claude constructs regex patterns to search for tools
  • BM25 (tool_search_tool_bm25_20251119): Claude uses natural language queries to search for tools
Use the extras parameter to specify defer_loading on LangChain tools:
from langchain_anthropic import ChatAnthropic
from langchain.tools import tool

@tool(extras={"defer_loading": True})  
def get_weather(location: str, unit: str = "fahrenheit") -> str:
    """Get the current weather for a location.

    Args:
        location: City name
        unit: Temperature unit (celsius or fahrenheit)
    """
    return f"Weather in {location}: Sunny"

@tool(extras={"defer_loading": True})  
def search_files(query: str) -> str:
    """Search through files in the workspace.

    Args:
        query: Search query
    """
    return f"Found files matching '{query}'"

model = ChatAnthropic(model="claude-sonnet-4-5-20250929")

model_with_tools = model.bind_tools([
    {"type": "tool_search_tool_regex_20251119", "name": "tool_search_tool_regex"},
    get_weather,
    search_files,
])
response = model_with_tools.invoke("What's the weather in San Francisco?")
Key points:
  • Tools with defer_loading: True are only loaded when Claude discovers them via search
  • Keep your 3-5 most frequently used tools as non-deferred for optimal performance
  • Both variants search tool names, descriptions, argument names, and argument descriptions
See the Claude documentation for more details on tool search, including usage with MCP servers and client-side implementations.

API reference

For detailed documentation of all features and configuration options, head to the ChatAnthropic API reference.