This guide provides a quick overview for getting started with the Baseten chat model. Baseten provides inference designed for production applications. Built on the Baseten Inference Stack, these APIs deliver enterprise-grade performance and reliability for leading open-source or custom models: https://www.baseten.co/library/.

Overview

Details

Class: ChatBaseten
Package: langchain-baseten
Serializable: beta
Downloads / Version: see langchain-baseten on PyPI

Features

  - Tool calling
  - Structured output
  - Image input
  - Audio input
  - Video input
  - Token-level streaming
  - Native async
  - Token usage
  - Logprobs
Model APIs only support text input, while some dedicated deployments support image and audio input depending on the model. Check the Baseten model library for details: https://www.baseten.co/library/

Setup

To access Baseten models, you’ll need to create a Baseten account, get an API key, and install the langchain-baseten integration package. Head to the Baseten website to create an account and generate an API key. Once you’ve done this, set the BASETEN_API_KEY environment variable:

Credentials

Set API key
import getpass
import os

if "BASETEN_API_KEY" not in os.environ:
    os.environ["BASETEN_API_KEY"] = getpass.getpass("Enter your Baseten API key: ")
To enable automated tracing of your model calls, set your LangSmith API key:
Enable tracing
os.environ["LANGSMITH_API_KEY"] = getpass.getpass("Enter your LangSmith API key: ")
os.environ["LANGSMITH_TRACING"] = "true"

Installation

The LangChain Baseten integration lives in the langchain-baseten package:
pip install -U langchain-baseten

Instantiation

Baseten offers two ways to access chat models:
  1. Model APIs: For access to the latest, most popular open-source models.
  2. Dedicated URLs: Use specific model deployments with dedicated resources.
Both approaches are supported with automatic endpoint normalization.
Initialize with model slug
from langchain_baseten import ChatBaseten

# Option 1: Use Model APIs with model slug
model = ChatBaseten(
    model="moonshotai/Kimi-K2-Instruct-0905",  # Choose from available model slugs: https://docs.baseten.co/development/model-apis/overview#supported-models
    api_key="your-api-key",  # Or set BASETEN_API_KEY env var
)
Initialize with model URL
from langchain_baseten import ChatBaseten

# Option 2: Use dedicated deployments with model url
model = ChatBaseten(
    model_url="https://model-<id>.api.baseten.co/environments/production/predict",
    api_key="your-api-key",  # Or set BASETEN_API_KEY env var
)

Invocation

Basic invocation
# Use the chat model
response = model.invoke("Hello, how are you?")
print(response.content)
Hello! I'm doing well, thank you for asking! How about you?
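Token-level streaming is listed among the features above. The sketch below assumes ChatBaseten follows the standard LangChain chat-model interface, where .stream() yields chunks carrying a content attribute; collect_stream is a small helper written for this guide, not part of the package:

```python
# Minimal streaming sketch. `collect_stream` is a hypothetical helper for
# this guide; the live call (commented out) requires BASETEN_API_KEY and a
# `model` initialized as shown above.

def collect_stream(chunks):
    """Join the `content` of streamed chunks into one string.

    Falls back to the item itself when it has no `content` attribute,
    so it also works on plain strings (handy for local testing).
    """
    return "".join(getattr(c, "content", c) for c in chunks)

# With a live model:
# for chunk in model.stream("Tell me a joke"):
#     print(chunk.content, end="", flush=True)
# full_text = collect_stream(model.stream("Tell me a joke"))

# The helper itself runs locally:
print(collect_stream(["Hel", "lo"]))  # Hello
```

Streaming prints tokens as they arrive instead of waiting for the full completion, which keeps interactive applications responsive.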
You can also use message objects for more complex conversations:
messages = [
    {"role": "system", "content": "You are a poetry expert"},
    {"role": "user", "content": "Write a haiku about spring"},
]
response = model.invoke(messages)
print(response)
content='Buds yawn open wide—\na robin stitches the hush\nwith threads of first light.' additional_kwargs={} response_metadata={'token_usage': {'completion_tokens': 26, 'prompt_tokens': 14, 'total_tokens': 40}, 'model_name': 'moonshotai/Kimi-K2-Instruct-0905', 'finish_reason': 'stop', 'model_provider': 'baseten'} id='run--6f7d1db7-daae-4628-a40a-2ab7323e8f15-0'
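Token usage is reported in the response metadata, as the printed message above shows. A sketch of reading it, using a dict that mirrors the metadata printed above; with a live response you would read response.response_metadata instead:

```python
# Sketch: reading token usage from response metadata. The dict below copies
# the shape of the metadata printed above; with a live response, use
# response.response_metadata["token_usage"].
metadata = {
    "token_usage": {"completion_tokens": 26, "prompt_tokens": 14, "total_tokens": 40},
    "model_name": "moonshotai/Kimi-K2-Instruct-0905",
}
usage = metadata["token_usage"]
summary = (
    f'{usage["prompt_tokens"]} prompt + {usage["completion_tokens"]} completion '
    f'= {usage["total_tokens"]} total tokens'
)
print(summary)  # 14 prompt + 26 completion = 40 total tokens
```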
Full guides are available on chat model invocation types, message types, and content blocks.
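Tool calling is also listed among the features above. The sketch below assumes ChatBaseten supports the standard LangChain bind_tools interface (which accepts plain Python functions); get_weather is a hypothetical tool written for this example:

```python
# Sketch of tool calling. `get_weather` is a made-up tool for illustration;
# the live calls (commented out) require BASETEN_API_KEY and a `model`
# initialized as shown earlier.

def get_weather(city: str) -> str:
    """Return a short, canned weather report for `city`."""
    return f"It is sunny in {city}."

# With a live model, assuming the standard LangChain interface:
# model_with_tools = model.bind_tools([get_weather])
# ai_msg = model_with_tools.invoke("What's the weather in Paris?")
# for call in ai_msg.tool_calls:
#     print(call["name"], call["args"])

# The tool logic itself runs locally:
print(get_weather("Paris"))  # It is sunny in Paris.
```

When the model decides to use a tool, it returns the tool name and arguments in tool_calls rather than executing anything itself; your application runs the function and passes the result back as a tool message.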