ChatCrusoe integration

This page will help you get started with Crusoe AI chat models. For detailed documentation of all ChatCrusoe features and configurations, head to the Crusoe managed inference docs.

Crusoe AI provides high-performance managed inference for leading open-source models via the Crusoe Intelligence Foundry, powered by proprietary MemoryAlloy™ technology for ultra-low latency and high throughput.

Overview

Integration details

Class	Package	Serializable	JS support
ChatCrusoe	langchain-crusoe	beta	❌

Model features

Tool calling	Structured output	Image input	Audio input	Video input	Token-level streaming	Native async	Token usage	Logprobs
✅	✅	❌	❌	❌	✅	✅	✅	❌

Setup

To access Crusoe models you’ll need to create a Crusoe Cloud account, get an Inference API key, and install the langchain-crusoe integration package.

Credentials

Head to the Crusoe Cloud Console to sign up. Then navigate to the Security tab and select Inference API Key to generate your key. Once you’ve done this, set the CRUSOE_API_KEY environment variable:

import getpass
import os

if "CRUSOE_API_KEY" not in os.environ:
    os.environ["CRUSOE_API_KEY"] = getpass.getpass("Enter your Crusoe API key: ")
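
In non-interactive environments such as CI jobs or servers, `getpass` has no terminal to prompt on, so it can help to fail fast with a clear message instead. The following is a minimal sketch of that idea; `require_env` is a hypothetical helper name used for illustration and is not part of langchain-crusoe.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or raise a clear error.

    Hypothetical helper for illustration; not part of langchain-crusoe.
    """
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set. Export it before starting the application."
        )
    return value


if __name__ == "__main__":
    try:
        require_env("CRUSOE_API_KEY")
        print("CRUSOE_API_KEY is set.")
    except RuntimeError as exc:
        print(f"Configuration error: {exc}")
```

Calling the helper once at startup surfaces a missing key before the first model call rather than partway through a request.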

To enable automated tracing of your model calls, set your LangSmith API key:

os.environ["LANGSMITH_API_KEY"] = getpass.getpass("Enter your LangSmith API key: ")
os.environ["LANGSMITH_TRACING"] = "true"

Installation

The LangChain Crusoe integration is included in the langchain-crusoe package. Install it with pip:

pip install -qU langchain-crusoe

Or, if you manage dependencies with uv:

uv add langchain-crusoe

Instantiation

Now we can instantiate our model object and generate chat completions:

from langchain_crusoe import ChatCrusoe

llm = ChatCrusoe(
    model="meta-llama/Llama-3.3-70B-Instruct",
    temperature=0,
    max_tokens=None,
    timeout=None,
    max_retries=2,
    # api_key="...",  # if not set via CRUSOE_API_KEY env var
    # other params...
)

Available models

Crusoe serves leading open-source models via the Intelligence Foundry. See the full model list for the latest availability.

Model	Provider	Context Length
`meta-llama/Llama-3.3-70B-Instruct`	Meta	128k
`openai/gpt-oss-120b`	OpenAI	128k
`deepseek-ai/DeepSeek-V3-0324`	DeepSeek	160k
`deepseek-ai/DeepSeek-R1-0528`	DeepSeek	160k
`deepseek-ai/DeepSeek-V3.1`	DeepSeek	160k
`Qwen/Qwen3-235B-A22B`	Qwen	131k
`google/gemma-3-12b-it`	Google	128k
`moonshotai/Kimi-K2-Thinking`	Moonshot AI	131k
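
When choosing a model for long inputs, the table above can be turned into a small lookup. This is only a sketch: the values are copied from the table in thousands of tokens, the exact token counts behind labels such as "128k" are not stated here, and `CONTEXT_LENGTH_K` and `models_with_context` are hypothetical names, not part of langchain-crusoe.

```python
# Context lengths copied from the table above, in thousands of tokens.
# The exact token counts behind labels like "128k" are not stated in the
# table, so treat these values as approximate.
CONTEXT_LENGTH_K = {
    "meta-llama/Llama-3.3-70B-Instruct": 128,
    "openai/gpt-oss-120b": 128,
    "deepseek-ai/DeepSeek-V3-0324": 160,
    "deepseek-ai/DeepSeek-R1-0528": 160,
    "deepseek-ai/DeepSeek-V3.1": 160,
    "Qwen/Qwen3-235B-A22B": 131,
    "google/gemma-3-12b-it": 128,
    "moonshotai/Kimi-K2-Thinking": 131,
}


def models_with_context(min_k: int) -> list[str]:
    """Return model IDs whose listed context length is at least min_k thousand tokens."""
    return sorted(m for m, k in CONTEXT_LENGTH_K.items() if k >= min_k)


print(models_with_context(160))
```

Because availability changes, a hard-coded table like this would need to be kept in sync with the full model list.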

Invocation

messages = [
    (
        "system",
        "You are a helpful assistant that translates English to French. Translate the user sentence.",
    ),
    ("human", "I love programming."),
]
ai_msg = llm.invoke(messages)
ai_msg

AIMessage(content="J'adore la programmation.", response_metadata={'token_usage': {'completion_tokens': 9, 'prompt_tokens': 35, 'total_tokens': 44}, 'model_name': 'meta-llama/Llama-3.3-70B-Instruct', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-...', usage_metadata={'input_tokens': 35, 'output_tokens': 9, 'total_tokens': 44})

print(ai_msg.content)

J'adore la programmation.
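
The `usage_metadata` field in the output above is a plain dictionary, so token counts can be summed across several calls. The sketch below uses a stand-in dictionary copied from that output rather than a live response, and `add_usage` is a hypothetical helper name.

```python
from collections import Counter
from typing import Mapping, Optional


def add_usage(totals: Counter, usage: Optional[Mapping[str, int]]) -> Counter:
    """Add one response's usage_metadata to the running totals."""
    if usage:
        for key in ("input_tokens", "output_tokens", "total_tokens"):
            totals[key] += usage.get(key, 0)
    return totals


# Stand-in for ai_msg.usage_metadata, copied from the example output above.
sample_usage = {"input_tokens": 35, "output_tokens": 9, "total_tokens": 44}

totals = Counter()
add_usage(totals, sample_usage)
add_usage(totals, sample_usage)  # a second, identical call
add_usage(totals, None)  # responses without usage data are skipped
print(dict(totals))
```

In real code, `ai_msg.usage_metadata` would be passed in place of `sample_usage` after each `llm.invoke` call.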

Chaining

We can chain our model with a prompt template like so:

from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a helpful assistant that translates {input_language} to {output_language}.",
        ),
        ("human", "{input}"),
    ]
)

chain = prompt | llm
chain.invoke(
    {
        "input_language": "English",
        "output_language": "German",
        "input": "I love programming.",
    }
)

AIMessage(content='Ich liebe Programmieren.', response_metadata={...}, id='run-...')

Streaming

stream = llm.stream_events(messages, version="v3")
for token in stream.text:
    print(token, end="", flush=True)
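
If the complete response is also needed after streaming, the text pieces can be collected while they are printed. This sketch assumes the stream yields plain strings, as the loop above suggests; `collect_text` and `fake_stream` are hypothetical names, and `fake_stream` stands in for a live model stream so the example runs offline.

```python
from typing import Iterable, Iterator


def collect_text(chunks: Iterable[str]) -> str:
    """Print each text chunk as it arrives and return the joined result."""
    parts = []
    for chunk in chunks:
        print(chunk, end="", flush=True)
        parts.append(chunk)
    print()  # finish the line once the stream is exhausted
    return "".join(parts)


def fake_stream() -> Iterator[str]:
    """Stand-in generator that mimics a token stream without a network call."""
    yield from ["J'adore", " la", " programmation."]


full_text = collect_text(fake_stream())
```

Joining a list once at the end avoids repeated string concatenation inside the loop.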

Tool calling

from pydantic import BaseModel, Field

class GetWeather(BaseModel):
    """Get the current weather in a given location."""
    location: str = Field(description="City and state, e.g. San Francisco, CA")

llm_with_tools = llm.bind_tools([GetWeather])
ai_msg = llm_with_tools.invoke("What's the weather like in San Francisco?")
print(ai_msg.tool_calls)
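
In LangChain, each entry in `ai_msg.tool_calls` is a dictionary with `name`, `args`, and `id` keys. The model only requests the call; running the tool is up to your code. The following is a minimal dispatch sketch under that assumption: `get_weather`, `TOOL_HANDLERS`, and `run_tool_calls` are hypothetical names, and the sample call (including its ID) is a made-up stand-in for a real model response.

```python
from typing import Any, Callable, Dict, List


def get_weather(location: str) -> str:
    """Hypothetical local handler; a real one would query a weather service."""
    return f"Weather lookup requested for {location}"


# Maps tool names (here, the Pydantic class name) to local handlers.
TOOL_HANDLERS: Dict[str, Callable[..., str]] = {"GetWeather": get_weather}


def run_tool_calls(tool_calls: List[Dict[str, Any]]) -> List[Dict[str, str]]:
    """Execute each requested tool call and pair the result with its call ID."""
    results = []
    for call in tool_calls:
        handler = TOOL_HANDLERS.get(call["name"])
        if handler is None:
            output = f"Unknown tool: {call['name']}"
        else:
            output = handler(**call["args"])
        results.append({"tool_call_id": call["id"], "content": output})
    return results


# Stand-in for ai_msg.tool_calls; the ID is a made-up placeholder.
sample_calls = [
    {
        "name": "GetWeather",
        "args": {"location": "San Francisco, CA"},
        "id": "call_example_1",
        "type": "tool_call",
    }
]
print(run_tool_calls(sample_calls))
```

Each result carries the `tool_call_id` it answers, which is the pairing a `ToolMessage` from `langchain_core.messages` needs when the outputs are sent back to the model in a follow-up call.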

Structured output

from pydantic import BaseModel, Field
from typing import Optional

class Joke(BaseModel):
    """Joke to tell user."""
    setup: str = Field(description="The setup of the joke")
    punchline: str = Field(description="The punchline to the joke")
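    # default=None makes the field genuinely optional; without a default,
    # Pydantic v2 treats an Optional field as required.
    rating: Optional[int] = Field(default=None, description="How funny the joke is, from 1 to 10")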

structured_llm = llm.with_structured_output(Joke)
structured_llm.invoke("Tell me a joke about cats")

Joke(setup='Why was the cat sitting on the computer?', punchline='To keep an eye on the mouse!', rating=7)

API reference

For detailed documentation of all ChatCrusoe features and configurations, head to the Crusoe managed inference docs.
