This guide provides a quick overview for getting started with the Baseten chat model. For a detailed listing of all ChatBaseten features, parameters, and configurations, head to the ChatBaseten API reference. Baseten provides inference designed for production applications. Built on the Baseten Inference Stack, these APIs deliver enterprise-grade performance and reliability for leading open-source and custom models; browse the model library at https://www.baseten.co/library/.

Overview

Details

| Class | Package | Local | Serializable | JS support | Downloads | Version |
| --- | --- | --- | --- | --- | --- | --- |
| ChatBaseten | langchain-baseten | | beta | | PyPI - Downloads | PyPI - Version |

Features

Tool calling | Structured output | JSON mode | Image input | Audio input | Video input | Token-level streaming | Native async | Token usage | Logprobs
Model APIs support text input only; some dedicated deployments support image and audio input, depending on the model. Check the Baseten model library for details: https://www.baseten.co/library/

Setup

To access Baseten models, you'll need to create a Baseten account, generate an API key, and install the langchain-baseten integration package. Once you have an API key, set the BASETEN_API_KEY environment variable:

Credentials

Set API key
import getpass
import os

if "BASETEN_API_KEY" not in os.environ:
    os.environ["BASETEN_API_KEY"] = getpass.getpass("Enter your Baseten API key: ")
To enable automated tracing of your model calls, set your LangSmith API key:
Enable tracing
os.environ["LANGSMITH_API_KEY"] = getpass.getpass("Enter your LangSmith API key: ")
os.environ["LANGSMITH_TRACING"] = "true"

Installation

The LangChain Baseten integration lives in the langchain-baseten package:
pip install -U langchain-baseten

Instantiation

Baseten offers two ways to access chat models:
  1. Model APIs: access the latest, most popular open-source models.
  2. Dedicated URLs: use specific model deployments with dedicated resources.
Both approaches are supported with automatic endpoint normalization.
Initialize with model slug
from langchain_baseten import ChatBaseten

# Option 1: Use Model APIs with model slug
model = ChatBaseten(
    model="moonshotai/Kimi-K2-Instruct-0905",  # Choose from available model slugs: https://docs.baseten.co/development/model-apis/overview#supported-models
    api_key="your-api-key",  # Or set BASETEN_API_KEY env var
)
Initialize with model URL
from langchain_baseten import ChatBaseten

# Option 2: Use dedicated deployments with a model URL
model = ChatBaseten(
    model_url="https://model-<id>.api.baseten.co/environments/production/predict",
    api_key="your-api-key",  # Or set BASETEN_API_KEY env var
)
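ChatBaseten follows the standard LangChain chat model constructor, so common sampling parameters should work as keyword arguments. The snippet below is a minimal sketch: temperature and max_tokens are assumed here, so confirm the exact parameter set in the ChatBaseten API reference.
Initialize with sampling parameters
from langchain_baseten import ChatBaseten

# Minimal sketch — temperature and max_tokens are assumed parameters;
# check the ChatBaseten API reference for the supported set.
model = ChatBaseten(
    model="moonshotai/Kimi-K2-Instruct-0905",
    temperature=0.2,  # lower values give more deterministic output
    max_tokens=512,   # cap on the number of generated tokens
)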

Invocation

Basic invocation
# Use the chat model
response = model.invoke("Hello, how are you?")
print(response.content)
content="Hello! I'm doing well, thank you for asking! How about you?" additional_kwargs={} response_metadata={'finish_reason': 'stop'} id='run--908651ec-00d7-4992-a320-864397c14e37-0'
You can also use message objects for more complex conversations:
messages = [
    {"role": "system", "content": "You are a poetry expert"},
    {"role": "user", "content": "Write a haiku about spring"},
]
response = model.invoke(messages)
print(response)
content='Buds yawn open wide—  \na robin stitches the hush  \nwith threads of first light.' additional_kwargs={} response_metadata={'finish_reason': 'stop'} id='run--6f7d1db7-daae-4628-a40a-2ab7323e8f15-0'
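Token-level streaming and native async appear in the feature list above. Assuming ChatBaseten implements the standard LangChain Runnable interface, a minimal streaming sketch looks like this:
Stream tokens
for chunk in model.stream("Write a short poem about autumn"):
    # Each chunk is an AIMessageChunk; print tokens as they arrive
    print(chunk.content, end="", flush=True)
For async code, await model.ainvoke(messages) and the async iterator model.astream(...) mirror their synchronous counterparts.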
Full guides are available on chat model invocation types, message types, and content blocks.
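Tool calling and structured output are also listed as features. The sketch below uses the generic LangChain bind_tools and with_structured_output interfaces rather than anything Baseten-specific; the GetWeather schema is a hypothetical example, and support depends on the model you choose, so confirm in the API reference.
Tool calling and structured output
from pydantic import BaseModel, Field

class GetWeather(BaseModel):
    """Get the current weather for a city."""
    city: str = Field(description="City name")

# Tool calling: the model emits structured tool invocations
model_with_tools = model.bind_tools([GetWeather])
ai_msg = model_with_tools.invoke("What's the weather in Paris?")
print(ai_msg.tool_calls)

# Structured output: responses are parsed into the schema
structured_model = model.with_structured_output(GetWeather)
print(structured_model.invoke("Weather for Tokyo, please"))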

API reference

For detailed documentation of all ChatBaseten features and configurations, head to the API reference.