
Parallel is a real-time web search and content extraction platform built for LLMs and AI applications.
ParallelExtractTool calls Parallel’s Extract API, which returns clean, markdown-formatted content from web pages, with optional focused excerpts driven by a search_objective. Pair it with ParallelSearchTool to build a search → extract pipeline.

Overview

Integration details

| Class | Package | Serializable | JS support | Package latest |
| --- | --- | --- | --- | --- |
| ParallelExtractTool | langchain-parallel | | | PyPI - Latest version |

Setup

The integration lives in the langchain-parallel package.
pip install -U langchain-parallel

Credentials

Head to Parallel to sign up and generate an API key. Set PARALLEL_API_KEY in your environment:
import getpass
import os

if not os.environ.get("PARALLEL_API_KEY"):
    os.environ["PARALLEL_API_KEY"] = getpass.getpass("Parallel API key:\n")

Instantiation

from langchain_parallel import ParallelExtractTool

tool = ParallelExtractTool()

# Or pass an explicit key, override the base URL, or cap per-URL `full_content` size:
# tool = ParallelExtractTool(
#     api_key="your-api-key",
#     base_url="https://api.parallel.ai",
#     max_chars_per_extract=5000,
# )

Invocation

Invoke directly with args

result = tool.invoke(
    {"urls": ["https://en.wikipedia.org/wiki/Artificial_intelligence"]}
)

print(result[0]["title"])
print(result[0]["url"])
print(result[0]["content"][:200], "...")
Multiple URLs in a single call:
result = tool.invoke(
    {
        "urls": [
            "https://en.wikipedia.org/wiki/Machine_learning",
            "https://en.wikipedia.org/wiki/Deep_learning",
            "https://en.wikipedia.org/wiki/Natural_language_processing",
        ]
    }
)

for item in result:
    print(item["title"], "—", item["url"])

Invoke with a ToolCall

Invoking with a model-generated ToolCall returns a ToolMessage:
model_generated_tool_call = {
    "args": {
        "urls": [
            "https://en.wikipedia.org/wiki/Climate_change",
            "https://en.wikipedia.org/wiki/Renewable_energy",
        ]
    },
    "id": "call_123",
    "name": tool.name,  # "parallel_extract"
    "type": "tool_call",
}

result = tool.invoke(model_generated_tool_call)
# result is a ToolMessage; the extracted payload is available on result.content

Async usage

async def extract_async():
    return await tool.ainvoke(
        {
            "urls": [
                "https://en.wikipedia.org/wiki/Python_(programming_language)",
                "https://en.wikipedia.org/wiki/JavaScript",
            ]
        }
    )

result = await extract_async()  # top-level await works in notebooks; in a script, use asyncio.run(extract_async())

Focused excerpts

Drive excerpt selection with a search_objective (or search_queries). Setting full_content=False skips the full markdown body and returns only matched excerpts:
result = tool.invoke(
    {
        "urls": ["https://en.wikipedia.org/wiki/Artificial_intelligence"],
        "search_objective": "What are the main applications and ethical concerns of AI?",
        "excerpts": {"max_chars_per_result": 2000},
        "full_content": False,
    }
)

Fetch policy and full-content sizing

Control caching, timeouts, and the per-URL full_content cap independently:
result = tool.invoke(
    {
        "urls": ["https://en.wikipedia.org/wiki/Quantum_computing"],
        "fetch_policy": {
            "max_age_seconds": 86400,
            "timeout_seconds": 60,
            "disable_cache_fallback": False,
        },
        "full_content": {"max_chars_per_result": 5000},
    }
)

full_content precedence: an explicit FullContentSettings (or dict) on the call always wins over the tool-level max_chars_per_extract. The latter only applies when you pass full_content=True as a plain bool.
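The precedence rule can be illustrated with a small standalone sketch. This is plain Python mirroring the documented behavior, not the library's internal code; resolve_max_chars is a hypothetical helper name:

```python
# Hypothetical illustration of the documented precedence rule:
# a call-level full_content setting beats the tool-level max_chars_per_extract,
# which only matters when full_content is the plain bool True.

def resolve_max_chars(tool_level_max, full_content):
    """Return the effective per-URL full_content cap, or None when the body is skipped."""
    if isinstance(full_content, dict):  # explicit call-level setting wins
        return full_content.get("max_chars_per_result")
    if full_content is True:            # plain bool: the tool-level cap applies
        return tool_level_max
    return None                         # full_content=False: no full body returned

print(resolve_max_chars(5000, {"max_chars_per_result": 2000}))  # 2000
print(resolve_max_chars(5000, True))                            # 5000
print(resolve_max_chars(5000, False))                           # None
```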

Per-URL error handling

Failed URLs are returned as items with error_type set, so partial success is the default behavior:
result = tool.invoke(
    {
        "urls": [
            "https://en.wikipedia.org/wiki/Artificial_intelligence",
            "https://this-domain-does-not-exist-12345.com/",
        ]
    }
)

for item in result:
    if "error_type" in item:
        print("failed:", item["url"], "—", item["content"])
    else:
        print("ok:", item["url"], f"({len(item['content'])} chars)")

Parameters

Required

  • urls: list of URLs to extract.

Optional

  • search_objective: natural-language description that drives excerpt selection.
  • search_queries: list of keyword strings used together with (or in place of) search_objective.
  • excerpts: per-result excerpt settings. Pass ExcerptSettings(max_chars_per_result=…) (or a dict) to control per-result excerpt size; omit for the API default.
  • full_content: True to return full markdown content (sized by the tool-level max_chars_per_extract), False to skip it, or FullContentSettings(max_chars_per_result=…) for fine-grained control.
  • fetch_policy: cache control, e.g. {"max_age_seconds": 86400, "timeout_seconds": 60}.
  • max_chars_total: cap on combined output length across all URLs.
  • client_model / session_id: forwarded to Parallel for downstream attribution.
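The optional parameters compose in a single args payload. A sketch combining several of them (field names follow the list above; the URL and values are illustrative):

```python
# A single args payload combining several optional parameters.
args = {
    "urls": ["https://en.wikipedia.org/wiki/Artificial_intelligence"],
    "search_queries": ["AI applications", "AI ethics"],
    "excerpts": {"max_chars_per_result": 1500},
    "full_content": False,         # excerpts only; skip the full markdown body
    "fetch_policy": {"max_age_seconds": 86400, "timeout_seconds": 60},
    "max_chars_total": 10000,
    "session_id": "demo-session",  # forwarded to Parallel for attribution
}

# Pass it straight to the tool:
# result = tool.invoke(args)
print(sorted(args))
```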

Chaining

Bind the tool to any tool-calling chat model and drive an agent with create_agent:
from langchain.agents import create_agent
from langchain.chat_models import init_chat_model

llm = init_chat_model(model="claude-opus-4-7")
agent = create_agent(model=llm, tools=[tool])

agent.invoke({"messages": [("human", "Summarize https://en.wikipedia.org/wiki/Quantum_computing")]})

Search → extract

Hand ParallelSearchTool and ParallelExtractTool to the same agent. The model uses search to find URLs and extract to drill into the ones it picks.
from langchain_parallel import ParallelSearchTool

search = ParallelSearchTool()
extract = ParallelExtractTool()
agent = create_agent(model=llm, tools=[search, extract])

agent.invoke({
    "messages": [
        ("human", "Find a recent peer-reviewed paper on net-energy-gain fusion and summarize it."),
    ]
})

Response format

[
    {
        "url": "https://example.com/article",
        "title": "Article Title",
        "content": "# Article Title\n\nMain content formatted as markdown...",
        "publish_date": "2026-01-15",
        "excerpts": ["...", "..."],  # if excerpts/search_objective requested
    },
    # Failed extractions:
    {
        "url": "https://failed-site.com",
        "title": None,
        "content": "Error: 404 Not Found",
        "error_type": "http_error",
    },
]
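Because failed URLs come back inline rather than raising, a caller typically partitions the list on the error_type marker. A minimal sketch over the sample response shape above:

```python
# Split a response list into successes and failures using the error_type marker.
sample = [
    {"url": "https://example.com/article", "title": "Article Title",
     "content": "# Article Title\n\nMain content...", "publish_date": "2026-01-15"},
    {"url": "https://failed-site.com", "title": None,
     "content": "Error: 404 Not Found", "error_type": "http_error"},
]

ok = [item for item in sample if "error_type" not in item]
failed = [item for item in sample if "error_type" in item]

print(len(ok), len(failed))  # 1 1
```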

API reference

For detailed documentation, head to the ParallelExtractTool API reference or the Parallel Extract reference.