Nimble’s Extract API extracts rendered content from specific URLs by browsing them with headless browsers. Unlike search APIs that discover content, the Extract tool handles known URLs—perfect for agent workflows that need to fetch and process specific web pages, including content behind pagination, filters, and client-side rendering.
Overview
Integration details
| Class | Package | Serializable | JS support |
|---|---|---|---|
| NimbleExtractTool | langchain-nimble | ❌ | ❌ |
Tool features
| Returns artifact | Native async | Return data | Pricing |
|---|---|---|---|
| ❌ | ✅ | title, URL, content (markdown/plain_text/HTML), metadata | Free trial available |
- URL extraction: Extract rendered content from 1-20 URLs in parallel
- Dynamic rendering: Handles JavaScript, lazy loading, and client-side rendering
- Multiple formats: plain_text (default), markdown, or simplified_html
- Configurable wait times: Control page load behavior for slow-loading content
- Browser drivers: Choose from vx6, vx8, or vx10 drivers for different rendering needs
- Production-ready: Native async support, automatic retries, connection pooling
Setup
The integration lives in the langchain-nimble package.
Credentials
You’ll need a Nimble API key to use this tool. Sign up at Nimble to get your API key and access their free trial.
Instantiation
Now we can instantiate the tool:
Use within an agent
We can use the Nimble extract tool with an agent to give it URL content extraction capabilities. Here’s a complete example using LangGraph:
Advanced configuration
The tool supports extensive configuration for URL extraction:
| Parameter | Type | Default | Description |
|---|---|---|---|
| links | list[str] | None | URLs to extract (1-20), provided by the agent at runtime |
| parsing_type | str | "plain_text" | Output format: "plain_text", "markdown", or "simplified_html" |
| driver | str | "vx6" | Browser driver version: "vx6" (fast), "vx8" (balanced), or "vx10" (comprehensive) |
| wait | int | None | Milliseconds to wait for page load (0-60000) |
| render | bool | True | Enable JavaScript rendering |
| locale | str | "en" | Page locale preference (e.g., "en-US") |
| country | str | "US" | Country code for localized content (e.g., "US") |
| api_key | str | env var | Nimble API key (defaults to the NIMBLE_API_KEY environment variable) |
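The allowed values and ranges in the table can be checked locally before any request is sent. The following is a minimal sketch under stated assumptions: `ExtractConfig` and `validate_links` are hypothetical helpers written for illustration and are not part of langchain-nimble; they only mirror the limits listed above.

```python
from dataclasses import dataclass
from typing import Optional

# Allowed values copied from the parameter table above.
PARSING_TYPES = {"plain_text", "markdown", "simplified_html"}
DRIVERS = {"vx6", "vx8", "vx10"}


@dataclass
class ExtractConfig:
    """Hypothetical container mirroring the table; not a langchain-nimble class."""

    parsing_type: str = "plain_text"
    driver: str = "vx6"
    wait: Optional[int] = None  # milliseconds, 0-60000
    render: bool = True
    locale: str = "en"
    country: str = "US"

    def __post_init__(self) -> None:
        # Reject values outside the documented options and ranges.
        if self.parsing_type not in PARSING_TYPES:
            raise ValueError(f"unknown parsing_type: {self.parsing_type!r}")
        if self.driver not in DRIVERS:
            raise ValueError(f"unknown driver: {self.driver!r}")
        if self.wait is not None and not 0 <= self.wait <= 60000:
            raise ValueError("wait must be between 0 and 60000 ms")


def validate_links(links: list[str]) -> list[str]:
    """Enforce the documented limit of 1-20 URLs per call."""
    if not 1 <= len(links) <= 20:
        raise ValueError("links must contain between 1 and 20 URLs")
    return links
```

Failing fast on an out-of-range value like this is a design choice for the sketch; the tool itself may report such errors differently.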
Best practices
Driver selection
- vx6 (default): Fast extraction for standard websites
- vx8: Balanced performance for moderately complex sites
- vx10: Comprehensive rendering for JavaScript-heavy SPAs and complex dynamic content
When to use wait times
- No wait (wait=None): Best for most modern websites with fast initial renders
- Short wait (wait=1000-2000): For sites with lazy loading or dynamic content
- Longer wait (wait=5000+): For slow-loading pages or complex SPAs that need time to fully render
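One way to apply the driver and wait-time guidance together is to start with the cheapest settings and escalate only when a page comes back nearly empty. The sketch below assumes a hypothetical `extract(url, driver, wait)` callable supplied by the caller (for example, a thin wrapper around the tool); the function name, the step values, and the `min_chars` threshold are illustrative assumptions, not part of langchain-nimble.

```python
from typing import Callable, Optional

# (driver, wait in ms) pairs, cheapest first; values follow the guidance above.
ESCALATION_STEPS: list[tuple[str, Optional[int]]] = [
    ("vx6", None),
    ("vx8", 2000),
    ("vx10", 5000),
]


def extract_with_escalation(
    url: str,
    extract: Callable[[str, str, Optional[int]], str],
    min_chars: int = 200,
) -> tuple[str, str, Optional[int]]:
    """Try each step in order until the content is at least `min_chars` long.

    `extract` is a hypothetical caller-supplied function returning page text.
    If no step produces enough content, the result of the last step is returned.
    """
    content = ""
    driver, wait = ESCALATION_STEPS[0]
    for driver, wait in ESCALATION_STEPS:
        content = extract(url, driver, wait)
        if len(content.strip()) >= min_chars:
            break
    return content, driver, wait
```

The length threshold is a crude stand-in for "the page rendered"; a site-specific check (such as looking for an expected heading) would usually be more reliable.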
URL management
- Batch extraction: Provide 1-20 URLs per call to extract in parallel
- Error handling: Failed URLs will be reported in agent error handling
- Content validation: The agent should validate extracted content before processing it
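The batching and validation points above can be sketched with a couple of small helpers. These are hypothetical utilities for illustration, not part of langchain-nimble: `batch_urls` splits a longer URL list into groups that respect the 20-URL limit, and `looks_valid` is a deliberately simple heuristic whose threshold is an assumption to tune per site.

```python
from typing import Iterator

MAX_LINKS_PER_CALL = 20  # per-call limit stated above


def batch_urls(urls: list[str], size: int = MAX_LINKS_PER_CALL) -> Iterator[list[str]]:
    """Yield de-duplicated URLs, in first-seen order, in batches of at most `size`."""
    unique = list(dict.fromkeys(urls))  # dict keys preserve insertion order
    for start in range(0, len(unique), size):
        yield unique[start:start + size]


def looks_valid(content: str, min_chars: int = 50) -> bool:
    """Heuristic check: treat very short or whitespace-only content as a failed extraction."""
    return len(content.strip()) >= min_chars
```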
Performance optimization
- Choose appropriate formats: Use plain_text for speed, markdown for structure, and simplified_html when markup details matter
- Tune wait times: Only use wait times when necessary to balance speed and reliability
- Batch related URLs: Extract multiple URLs from the same domain in parallel for efficiency
- Use async: Call ainvoke() when extracting many URLs concurrently
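As a rough illustration of the async advice, several batches can be dispatched concurrently with a cap on how many requests are in flight. This sketch assumes `tool` is any object with an awaitable `ainvoke` method (as LangChain tools provide) and that URLs are passed under a `links` key, following the parameter table; `extract_all` and `max_concurrency` are hypothetical names introduced here.

```python
import asyncio
from typing import Any


async def extract_all(
    tool: Any, batches: list[list[str]], max_concurrency: int = 3
) -> list[Any]:
    """Run one `ainvoke` call per batch, with at most `max_concurrency` in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def run(batch: list[str]) -> Any:
        async with semaphore:
            # Assumption: the tool accepts its URLs under the "links" key.
            return await tool.ainvoke({"links": batch})

    # return_exceptions=True returns failures as values instead of raising
    # on the first one, so each batch's outcome can be inspected afterwards.
    return await asyncio.gather(*(run(b) for b in batches), return_exceptions=True)
```

Results come back in the same order as `batches`, so a failed batch can be matched to its URLs and retried on its own.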
API reference
For detailed documentation of all NimbleExtractTool features and configurations, visit the Nimble API documentation.