This guide provides a quick overview for getting started with the Baseten chat model. For a detailed listing of all ChatBaseten features, parameters, and configurations, head to the ChatBaseten API reference. Baseten provides inference designed for production applications. Built on the Baseten Inference Stack, these APIs deliver enterprise-grade performance and reliability for leading open-source and custom models; browse the model library at https://www.baseten.co/library/.

Overview

Details

| Class | Package | Local | Serializable | JS support | Downloads | Version |
| --- | --- | --- | --- | --- | --- | --- |
| ChatBaseten | langchain-baseten | | beta | | PyPI - Downloads | PyPI - Version |

Features

Tool calling | Structured output | JSON mode | Image input | Audio input | Video input | Token-level streaming | Native async | Token usage | Logprobs
Model APIs support text input only; some dedicated deployments support image and audio input, depending on the model. Check the Baseten model library for details: https://www.baseten.co/library/

Setup

To access Baseten models, you'll need to create a Baseten account, generate an API key, and install the langchain-baseten integration package. Once you have an API key, set the BASETEN_API_KEY environment variable:

Credentials

Set API key
import getpass
import os

if "BASETEN_API_KEY" not in os.environ:
    os.environ["BASETEN_API_KEY"] = getpass.getpass("Enter your Baseten API key: ")
To enable automated tracing of your model calls, set your LangSmith API key:
Enable tracing
os.environ["LANGSMITH_API_KEY"] = getpass.getpass("Enter your LangSmith API key: ")
os.environ["LANGSMITH_TRACING"] = "true"

Installation

The LangChain Baseten integration lives in the langchain-baseten package:
pip install -U langchain-baseten

Instantiation

Baseten offers two ways to access chat models:
  1. Model APIs: access the latest, most popular open-source models.
  2. Dedicated URLs: use specific model deployments with dedicated resources.
Both approaches are supported with automatic endpoint normalization.
Initialize with model slug
from langchain_baseten import ChatBaseten

# Option 1: Use Model APIs with model slug
model = ChatBaseten(
    model="moonshotai/Kimi-K2-Instruct-0905",  # Choose from available model slugs: https://docs.baseten.co/development/model-apis/overview#supported-models
    api_key="your-api-key",  # Or set BASETEN_API_KEY env var
)
Initialize with model URL
from langchain_baseten import ChatBaseten

# Option 2: Use dedicated deployments with a model URL
model = ChatBaseten(
    model_url="https://model-<id>.api.baseten.co/environments/production/predict",
    api_key="your-api-key",  # Or set BASETEN_API_KEY env var
)
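ChatBaseten follows the standard LangChain chat model constructor, so common sampling parameters should work as keyword arguments. The snippet below is a minimal sketch: temperature and max_tokens are assumed here, so confirm the exact parameter set in the ChatBaseten API reference.
Initialize with sampling parameters
from langchain_baseten import ChatBaseten

# Minimal sketch — temperature and max_tokens are assumed parameters;
# check the ChatBaseten API reference for the supported set.
model = ChatBaseten(
    model="moonshotai/Kimi-K2-Instruct-0905",
    temperature=0.2,  # lower values give more deterministic output
    max_tokens=512,   # cap on the number of generated tokens
)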

Invocation

Basic invocation
# Use the chat model
response = model.invoke("Hello, how are you?")
print(response.content)
content="Hello! I'm doing well, thank you for asking! How about you?" additional_kwargs={} response_metadata={'finish_reason': 'stop'} id='run--908651ec-00d7-4992-a320-864397c14e37-0'
You can also use message objects for more complex conversations:
messages = [
    {"role": "system", "content": "You are a poetry expert"},
    {"role": "user", "content": "Write a haiku about spring"},
]
response = model.invoke(messages)
print(response)
content='Buds yawn open wide—  \na robin stitches the hush  \nwith threads of first light.' additional_kwargs={} response_metadata={'finish_reason': 'stop'} id='run--6f7d1db7-daae-4628-a40a-2ab7323e8f15-0'
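Token-level streaming and native async appear in the feature list above. Assuming ChatBaseten implements the standard LangChain Runnable interface, a minimal streaming sketch looks like this:
Stream tokens
for chunk in model.stream("Write a short poem about autumn"):
    # Each chunk is an AIMessageChunk; print tokens as they arrive
    print(chunk.content, end="", flush=True)
For async code, await model.ainvoke(messages) and the async iterator model.astream(...) mirror their synchronous counterparts.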
Full guides are available on chat model invocation types, message types, and content blocks.
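Tool calling and structured output are also listed as features. The sketch below uses the generic LangChain bind_tools and with_structured_output interfaces rather than anything Baseten-specific; the GetWeather schema is a hypothetical example, and support depends on the model you choose, so confirm in the API reference.
Tool calling and structured output
from pydantic import BaseModel, Field

class GetWeather(BaseModel):
    """Get the current weather for a city."""
    city: str = Field(description="City name")

# Tool calling: the model emits structured tool invocations
model_with_tools = model.bind_tools([GetWeather])
ai_msg = model_with_tools.invoke("What's the weather in Paris?")
print(ai_msg.tool_calls)

# Structured output: responses are parsed into the schema
structured_model = model.with_structured_output(GetWeather)
print(structured_model.invoke("Weather for Tokyo, please"))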

API reference

For detailed documentation of all ChatBaseten features and configurations, head to the API reference.