This page will help you get started with Crusoe AI chat models. For detailed documentation of all ChatCrusoe features and configurations, head to the Crusoe managed inference docs. Crusoe AI provides high-performance managed inference for leading open-source models via the Crusoe Intelligence Foundry, powered by proprietary MemoryAlloy™ technology for ultra-low latency and high throughput.

Overview

Integration details

| Class | Package | Serializable | JS support |
| --- | --- | --- | --- |
| ChatCrusoe | langchain-crusoe | beta | |

Model features

Tool calling | Structured output | Image input | Audio input | Video input | Token-level streaming | Native async | Token usage | Logprobs

Setup

To access Crusoe models you’ll need to create a Crusoe Cloud account, get an Inference API key, and install the langchain-crusoe integration package.

Credentials

Head to the Crusoe Cloud Console to sign up. Then navigate to the Security tab and select Inference API Key to generate your key. Once you’ve done this, set the CRUSOE_API_KEY environment variable:
import getpass
import os

if "CRUSOE_API_KEY" not in os.environ:
    os.environ["CRUSOE_API_KEY"] = getpass.getpass("Enter your Crusoe API key: ")
To enable automated tracing of your model calls, set your LangSmith API key:
os.environ["LANGSMITH_API_KEY"] = getpass.getpass("Enter your LangSmith API key: ")
os.environ["LANGSMITH_TRACING"] = "true"
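The getpass prompt above suits notebooks and interactive shells. In a script or service with no terminal attached, a prompt would block, so it can be more convenient to fail fast when the key is missing. The following is a minimal sketch of that idea; `require_env` is a hypothetical helper written for this example, not part of langchain-crusoe.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise a clear error.

    Hypothetical helper for illustration; not part of langchain-crusoe.
    """
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set; export it before starting the process."
        )
    return value
```

Calling `require_env("CRUSOE_API_KEY")` once at startup surfaces a missing key immediately instead of at the first model call.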

Installation

The LangChain Crusoe integration is included in the langchain-crusoe package:
pip install -qU langchain-crusoe

Instantiation

Now we can instantiate our model object and generate chat completions:
from langchain_crusoe import ChatCrusoe

llm = ChatCrusoe(
    model="meta-llama/Llama-3.3-70B-Instruct",
    temperature=0,
    max_tokens=None,
    timeout=None,
    max_retries=2,
    # api_key="...",  # if not set via CRUSOE_API_KEY env var
    # other params...
)

Available models

Crusoe serves leading open-source models via the Intelligence Foundry. See the full model list for the latest availability.
| Model | Provider | Context Length |
| --- | --- | --- |
| meta-llama/Llama-3.3-70B-Instruct | Meta | 128k |
| openai/gpt-oss-120b | OpenAI | 128k |
| deepseek-ai/DeepSeek-V3-0324 | DeepSeek | 160k |
| deepseek-ai/DeepSeek-R1-0528 | DeepSeek | 160k |
| deepseek-ai/DeepSeek-V3.1 | DeepSeek | 160k |
| Qwen/Qwen3-235B-A22B | Qwen | 131k |
| google/gemma-3-12b-it | Google | 128k |
| moonshotai/Kimi-K2-Thinking | Moonshot AI | 131k |
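When choosing a model for long prompts, it can help to check the token budget against the context lengths listed above. The sketch below copies those figures into a dictionary and tests whether a prompt plus its expected output fits. Treating "k" as 1,000 tokens is an assumption (the exact limits may differ), and `fits_context` is a hypothetical helper, not a langchain-crusoe function.

```python
# Context lengths in thousands of tokens, copied from the table above.
# Assumption: "k" means 1,000 tokens; check the full model list for exact limits.
CONTEXT_LENGTH_K = {
    "meta-llama/Llama-3.3-70B-Instruct": 128,
    "openai/gpt-oss-120b": 128,
    "deepseek-ai/DeepSeek-V3-0324": 160,
    "deepseek-ai/DeepSeek-R1-0528": 160,
    "deepseek-ai/DeepSeek-V3.1": 160,
    "Qwen/Qwen3-235B-A22B": 131,
    "google/gemma-3-12b-it": 128,
    "moonshotai/Kimi-K2-Thinking": 131,
}


def fits_context(model: str, prompt_tokens: int, max_output_tokens: int) -> bool:
    """Return True if prompt plus requested output fits the model's context window."""
    limit = CONTEXT_LENGTH_K[model] * 1000
    return prompt_tokens + max_output_tokens <= limit


print(fits_context("google/gemma-3-12b-it", 120_000, 20_000))
print(fits_context("deepseek-ai/DeepSeek-V3.1", 120_000, 20_000))
```

Because availability changes, the dictionary is only a snapshot of this page and would need updating alongside the full model list.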

Invocation

messages = [
    (
        "system",
        "You are a helpful assistant that translates English to French. Translate the user sentence.",
    ),
    ("human", "I love programming."),
]
ai_msg = llm.invoke(messages)
ai_msg
AIMessage(content="J'adore la programmation.", response_metadata={'token_usage': {'completion_tokens': 9, 'prompt_tokens': 35, 'total_tokens': 44}, 'model_name': 'meta-llama/Llama-3.3-70B-Instruct', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-...', usage_metadata={'input_tokens': 35, 'output_tokens': 9, 'total_tokens': 44})
print(ai_msg.content)
J'adore la programmation.
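The usage_metadata field in the output above reports token counts for a single call. To track consumption over several calls, one option is to sum those dictionaries. The sketch below works on plain dictionaries shaped like the one shown above, so it runs without a model. `total_usage` is a hypothetical helper, and the sample values are taken from the example output.

```python
from collections import Counter


def total_usage(usages):
    """Sum the integer token counts across several usage_metadata dictionaries."""
    totals = Counter()
    for usage in usages:
        # Keep only integer counters; skip any nested detail fields.
        totals.update({k: v for k, v in usage.items() if isinstance(v, int)})
    return dict(totals)


# Values copied from the usage_metadata shown in the output above.
sample = {"input_tokens": 35, "output_tokens": 9, "total_tokens": 44}
print(total_usage([sample, sample]))
# {'input_tokens': 70, 'output_tokens': 18, 'total_tokens': 88}
```

With a live model, the input would be the `usage_metadata` attribute of each returned message, for example `total_usage([ai_msg.usage_metadata])`.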

Chaining

We can chain our model with a prompt template like so:
from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a helpful assistant that translates {input_language} to {output_language}.",
        ),
        ("human", "{input}"),
    ]
)

chain = prompt | llm
chain.invoke(
    {
        "input_language": "English",
        "output_language": "German",
        "input": "I love programming.",
    }
)
AIMessage(content='Ich liebe Programmieren.', response_metadata={...}, id='run-...')

Streaming

for chunk in llm.stream(messages):
    print(chunk.content, end="", flush=True)
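The loop above prints tokens as they arrive but does not keep them. If the full response is needed afterwards, the chunks can be collected while printing. The sketch below assumes each chunk exposes a string `content` attribute, as in the loop above. `collect_stream` is a hypothetical helper, and the stand-in chunks are built with `SimpleNamespace` so the example runs without a model.

```python
from types import SimpleNamespace
from typing import Iterable


def collect_stream(chunks: Iterable) -> str:
    """Print each chunk's text as it arrives and return the concatenated result."""
    parts = []
    for chunk in chunks:
        # Only plain-string content is handled in this sketch.
        if isinstance(chunk.content, str):
            print(chunk.content, end="", flush=True)
            parts.append(chunk.content)
    print()  # finish the line once the stream ends
    return "".join(parts)


# Stand-in chunks for demonstration; with a real model, pass llm.stream(messages).
fake_chunks = [SimpleNamespace(content=t) for t in ["J'adore", " la", " programmation."]]
text = collect_stream(fake_chunks)
```

With the model from the Instantiation section, the call would be `collect_stream(llm.stream(messages))`.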

Tool calling

from pydantic import BaseModel, Field

class GetWeather(BaseModel):
    """Get the current weather in a given location."""
    location: str = Field(description="City and state, e.g. San Francisco, CA")

llm_with_tools = llm.bind_tools([GetWeather])
ai_msg = llm_with_tools.invoke("What's the weather like in San Francisco?")
print(ai_msg.tool_calls)
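The model only requests a tool call; running the tool is left to the application. In LangChain, `ai_msg.tool_calls` is a list of dictionaries with `name`, `args`, and `id` keys. The sketch below dispatches such entries to Python functions. The registry, the placeholder `get_weather` implementation, and the sample tool call are all hypothetical and stand in for whatever the model actually returns.

```python
def get_weather(location: str) -> str:
    """Stand-in implementation; a real version would query a weather service."""
    return f"(placeholder) weather report for {location}"


# Maps the tool name the model sees to the function that implements it.
TOOL_REGISTRY = {"GetWeather": get_weather}


def run_tool_calls(tool_calls):
    """Execute each requested tool and return results keyed by tool call id."""
    results = []
    for call in tool_calls:
        handler = TOOL_REGISTRY.get(call["name"])
        if handler is None:
            content = f"Unknown tool: {call['name']}"
        else:
            content = handler(**call["args"])
        results.append({"tool_call_id": call["id"], "content": content})
    return results


# Hypothetical example of what ai_msg.tool_calls might contain.
sample_tool_calls = [
    {
        "name": "GetWeather",
        "args": {"location": "San Francisco, CA"},
        "id": "call_example_1",
        "type": "tool_call",
    }
]
print(run_tool_calls(sample_tool_calls))
```

Each result carries the originating `tool_call_id`, so it can be sent back to the model as a tool message in a follow-up call.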

Structured output

from pydantic import BaseModel, Field
from typing import Optional

class Joke(BaseModel):
    """Joke to tell user."""
    setup: str = Field(description="The setup of the joke")
    punchline: str = Field(description="The punchline to the joke")
    rating: Optional[int] = Field(default=None, description="How funny the joke is, from 1 to 10")

structured_llm = llm.with_structured_output(Joke)
structured_llm.invoke("Tell me a joke about cats")
Joke(setup='Why was the cat sitting on the computer?', punchline='To keep an eye on the mouse!', rating=7)

API reference

For detailed documentation of all ChatCrusoe features and configurations, head to the Crusoe managed inference docs.