Tavily is a search engine built specifically for AI agents (LLMs), delivering real-time, accurate, and factual results at speed. Tavily offers an Extract endpoint that can be used to extract the cleaned, parsed content of one or more URLs.
Overview
Integration details
| Returns artifact | Native async | Return data | Pricing |
|---|---|---|---|
| ❌ | ✅ | raw content and images | 1,000 free credits / month |
Setup
The integration lives in the @langchain/tavily package, which you can install as shown below:
npm install @langchain/tavily @langchain/core
Credentials
Set up a Tavily API key and set it as an environment variable named TAVILY_API_KEY.
process.env.TAVILY_API_KEY = "YOUR_API_KEY"
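A missing key usually only surfaces on the first request, so it can help to fail fast at startup. The following is a minimal sketch, not part of @langchain/tavily: `requireEnv` is a hypothetical helper, and a plain object stands in for `process.env` so the snippet runs on its own.

```typescript
// Hypothetical helper (not part of @langchain/tavily): return a required
// environment variable, or throw a descriptive error if it is unset or blank.
function requireEnv(
  env: Record<string, string | undefined>,
  name: string,
): string {
  const value = env[name];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// A plain object stands in for process.env in this sketch.
const fakeEnv: Record<string, string | undefined> = {
  TAVILY_API_KEY: "YOUR_API_KEY",
};
const apiKey = requireEnv(fakeEnv, "TAVILY_API_KEY");
console.log(apiKey.length > 0);
```

In a Node.js program the same helper could be called as `requireEnv(process.env, "TAVILY_API_KEY")` before the tool is constructed.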
Optionally, you can also set up LangSmith for observability:
process.env.LANGSMITH_TRACING="true"
process.env.LANGSMITH_API_KEY="your-api-key"
Instantiation
The tool accepts the following parameters during instantiation:
- extractDepth (optional, string): Depth of the extraction, either "basic" or "advanced". Default is "basic".
- includeImages (optional, boolean): Whether to include images in the extraction. Default is false.
- format (optional, string): Content format, either "markdown" or "text". Default is "markdown".
- includeFavicon (optional, boolean): Whether to include each result’s favicon URL. Default is false.
For a comprehensive overview of the available parameters, refer to the Tavily Extract API documentation.
import { TavilyExtract } from "@langchain/tavily";
const tool = new TavilyExtract({
  extractDepth: "basic",
  includeImages: false,
});
Invocation
The Tavily extract tool accepts the following arguments during invocation:
- urls (required): A list of URLs to extract content from.
- The following arguments can also be set during invocation: extractDepth and includeImages.
NOTE: The optional arguments are available for agents to dynamically set. If you set an argument during instantiation and then invoke the tool with a different value, the tool will use the value you passed during invocation.
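The precedence rule in the note can be pictured as a merge in which invocation-time values override instantiation-time defaults. The sketch below only illustrates that rule and is not the package's internal implementation; `ExtractOptions` and `resolveOptions` are hypothetical names introduced here.

```typescript
// Hypothetical types and helper, for illustration only.
// They are not exported by @langchain/tavily.
type ExtractDepth = "basic" | "advanced";

interface ExtractOptions {
  extractDepth?: ExtractDepth;
  includeImages?: boolean;
}

// Merge instantiation defaults with invocation arguments.
// A value supplied at invocation wins; an omitted value falls back to the default.
function resolveOptions(
  instantiation: ExtractOptions,
  invocation: ExtractOptions,
): ExtractOptions {
  return {
    extractDepth: invocation.extractDepth ?? instantiation.extractDepth,
    includeImages: invocation.includeImages ?? instantiation.includeImages,
  };
}

const defaults: ExtractOptions = { extractDepth: "basic", includeImages: false };
const resolved = resolveOptions(defaults, { extractDepth: "advanced" });
// extractDepth comes from the invocation, includeImages from the defaults.
console.log(resolved);
```

The nullish coalescing operator (`??`) is used rather than `||` so that an explicit `false` passed at invocation still overrides a default of `true`.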
await tool.invoke({ urls: ["https://en.wikipedia.org/wiki/Lionel_Messi"] });
{
  results: [{
    url: 'https://en.wikipedia.org/wiki/Lionel_Messi',
    raw_content: 'Lionel Messi - Wikipedia\nJump to content\nMain menu\n... (truncated)',
    images: []
  }],
  failed_results: [],
  response_time: 0.02
}
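For typed code, the response shape shown above can be described with interfaces. This sketch is inferred only from the sample output: the names `ExtractResult`, `FailedResult`, `ExtractResponse`, and `summarize` are hypothetical, `images` is assumed to hold URL strings, and the `url` field on failed results is an assumption because the sample shows an empty `failed_results` array.

```typescript
// Shapes inferred from the sample output; treat them as assumptions.
interface ExtractResult {
  url: string;
  raw_content: string;
  images: string[]; // assumed to be image URLs
}

interface FailedResult {
  url: string; // assumed field; not visible in the sample output
}

interface ExtractResponse {
  results: ExtractResult[];
  failed_results: FailedResult[];
  response_time: number;
}

// Hypothetical helper: list which URLs succeeded or failed and
// count the total number of characters extracted.
function summarize(response: ExtractResponse) {
  return {
    extracted: response.results.map((r) => r.url),
    failed: response.failed_results.map((f) => f.url),
    totalChars: response.results.reduce((sum, r) => sum + r.raw_content.length, 0),
  };
}

// Hand-written sample standing in for a real tool response.
const sample: ExtractResponse = {
  results: [
    {
      url: "https://en.wikipedia.org/wiki/Lionel_Messi",
      raw_content: "Lionel Messi - Wikipedia",
      images: [],
    },
  ],
  failed_results: [],
  response_time: 0.02,
};

const summary = summarize(sample);
console.log(summary);
```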
We can also invoke the tool with a model-generated ToolCall, in which case a ToolMessage will be returned:
// This is usually generated by a model, but we'll create a tool call directly for demo purposes.
const modelGeneratedToolCall = {
  args: { urls: ["https://en.wikipedia.org/wiki/Lionel_Messi"] },
  id: "1",
  name: tool.name,
  type: "tool_call",
};
const toolMsg = await tool.invoke(modelGeneratedToolCall);
// The content is a JSON string of results
console.log(toolMsg.content.slice(0, 400));
{"results": [{"url": "https://en.wikipedia.org/wiki/Lionel_Messi", "raw_content": "Lionel Messi - Wikipedia\nJump to content\nMain menu\nMain menu\nmove to sidebar hide\nNavigation\n\nMain page\nContents\nCurrent events\nRandom article\nAbout Wikipedia\nContact us\n\nContribute\n\nHelp\nLearn to edit\nCommunity portal\nRecent changes\nUpload file\nSpecial pages\n... (truncated)"}], "failed_results": [], "response_time": 0.02}
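Because the ToolMessage content arrives as a JSON string, it has to be parsed before individual fields can be read. Below is a minimal sketch that uses a hand-written string in place of a real `toolMsg.content`; `parseExtractContent` is a hypothetical helper, and the type guard is there because message content is not guaranteed to be a string.

```typescript
// Hypothetical helper: parse the JSON string carried in a ToolMessage's content.
// Returns null when the content is not a string, is not valid JSON,
// or does not contain a results array.
function parseExtractContent(
  content: unknown,
): { results: { url: string; raw_content: string }[] } | null {
  if (typeof content !== "string") return null;
  try {
    const parsed = JSON.parse(content);
    return Array.isArray(parsed?.results) ? parsed : null;
  } catch {
    return null; // content was not valid JSON
  }
}

// Hand-written stand-in for toolMsg.content.
const content =
  '{"results": [{"url": "https://example.com", "raw_content": "Hello"}], "failed_results": [], "response_time": 0.02}';

const parsedContent = parseExtractContent(content);
console.log(parsedContent?.results.map((r) => r.url));
```

Parse the full content rather than a slice of it: a string truncated with `slice(0, 400)` is generally no longer valid JSON, and the helper would return null for it.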
Use within an agent
We can use the extract tool directly with a LangChain agent by passing it to createAgent. The agent can dynamically set the list of URLs, as well as options such as extractDepth and includeImages, as part of its tool call.
import { ChatOpenAI } from "@langchain/openai";
const llm = new ChatOpenAI({
  model: "gpt-5.4",
  temperature: 0,
});
import { TavilyExtract } from "@langchain/tavily";
import { createAgent } from "langchain";
const tavilyExtractTool = new TavilyExtract();
const agent = createAgent({
  model: llm,
  tools: [tavilyExtractTool],
});
const userInput = "Summarize https://en.wikipedia.org/wiki/Albert_Einstein and https://en.wikipedia.org/wiki/Theoretical_physics.";
const events = await agent.stream(
  { messages: [["human", userInput]] },
  { streamMode: "values" },
);
for await (const event of events) {
  const lastMsg = event.messages[event.messages.length - 1];
  if (lastMsg.tool_calls?.length) {
    console.dir(lastMsg.tool_calls, { depth: null });
  } else if (lastMsg.content) {
    console.log(lastMsg.content);
  }
}
API reference
For detailed documentation of all Tavily Extract API features and configuration options, see the API reference: docs.tavily.com/documentation/api-reference/endpoint/extract