Use the `ChatRunPod` class to interact with chat models hosted on RunPod Serverless. Ensure the environment variables `RUNPOD_API_KEY` and `RUNPOD_ENDPOINT_ID` (or a specific `RUNPOD_CHAT_ENDPOINT_ID`) are set.

Instantiate the `ChatRunPod` class. You can pass model-specific parameters via `model_kwargs` and configure polling behavior.
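A minimal setup sketch follows. The environment variable names come from the description above; the constructor keywords shown (`model_kwargs`) are also described above, but the exact signature should be checked against the source code, and the live call is left commented out because it requires an installed package and a deployed endpoint:

```python
import os

# Environment variables read by the integration (named above).
os.environ["RUNPOD_API_KEY"] = "your-api-key"
os.environ["RUNPOD_ENDPOINT_ID"] = "your-endpoint-id"
# Optional chat-specific override:
# os.environ["RUNPOD_CHAT_ENDPOINT_ID"] = "your-chat-endpoint-id"

# Hypothetical usage (requires `pip install langchain-runpod` and a live
# endpoint), kept as a comment so this sketch runs standalone:
#
#   from langchain_runpod import ChatRunPod
#   chat = ChatRunPod(model_kwargs={"temperature": 0.7, "max_tokens": 256})
#   print(chat.invoke("Hello, RunPod!").content)
```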
Use the standard `.invoke()` and `.ainvoke()` methods to call the model. Streaming is also supported via `.stream()` and `.astream()` (simulated by polling the RunPod `/stream` endpoint).
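The polling-based streaming described above can be sketched with a stub in place of the real HTTP request. `fetch_status` below is hypothetical and stands in for a GET against the RunPod `/stream` endpoint; the chunk shape (`[{"output": "token"}]`) follows the feature table below:

```python
import time
from typing import Iterator

def fetch_status(poll: int) -> dict:
    # Stubbed status responses; a real client would call /stream over HTTP.
    responses = [
        {"status": "IN_PROGRESS", "stream": [{"output": "Hel"}]},
        {"status": "IN_PROGRESS", "stream": [{"output": "lo"}]},
        {"status": "COMPLETED", "stream": [{"output": "!"}]},
    ]
    return responses[min(poll, len(responses) - 1)]

def poll_stream(poll_interval: float = 0.0) -> Iterator[str]:
    # Yield token chunks until the job reports COMPLETED.
    poll = 0
    while True:
        status = fetch_status(poll)
        for chunk in status.get("stream", []):
            yield chunk["output"]
        if status["status"] == "COMPLETED":
            break
        poll += 1
        time.sleep(poll_interval)

print("".join(poll_stream()))  # -> Hello!
```

This is why streaming latency depends on the endpoint: tokens only arrive as fast as the handler populates the `stream` list between polls.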
The `ChatRunPod` integration provides the basic framework, but the endpoint's handler must support the underlying functionality.
| Feature | Integration Support | Endpoint Dependent? | Notes |
|---|---|---|---|
| Tool calling | ❌ | ✅ | Requires the handler to process tool definitions and return tool calls (e.g., OpenAI format). The integration needs parsing logic. |
| Structured output | ❌ | ✅ | Requires handler support for forcing structured output (JSON mode, function calling). The integration needs parsing logic. |
| JSON mode | ❌ | ✅ | Requires the handler to accept a `json_mode` parameter (or similar) and guarantee JSON output. |
| Image input | ❌ | ✅ | Requires a multimodal handler accepting image data (e.g., base64). The integration does not support multimodal messages. |
| Audio input | ❌ | ✅ | Requires a handler accepting audio data. The integration does not support audio messages. |
| Video input | ❌ | ✅ | Requires a handler accepting video data. The integration does not support video messages. |
| Token-level streaming | ✅ (Simulated) | ✅ | Polls `/stream`. Requires the handler to populate a `stream` list in the status response with token chunks (e.g., `[{"output": "token"}]`). True low-latency streaming is not built in. |
| Native async | ✅ | ✅ | Core `ainvoke`/`astream` implemented. Relies on endpoint handler performance. |
| Token usage | ❌ | ✅ | Requires the handler to return `prompt_tokens` and `completion_tokens` in the final response. The integration currently does not parse this. |
| Logprobs | ❌ | ✅ | Requires the handler to return log probabilities. The integration currently does not parse this. |
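Pulling the notes above together, a final status payload that would satisfy the streaming and token-usage rows might look like the following. This is a sketch only: the `stream`, `prompt_tokens`, and `completion_tokens` fields come from the table, while the `output`/`text` envelope is an assumed handler-specific shape:

```python
# Hypothetical handler status response combining fields from the table above.
final_status = {
    "status": "COMPLETED",
    # Token chunks consumed by the simulated streaming (table row above).
    "stream": [{"output": "Hello"}, {"output": "!"}],
    # Assumed envelope; the usage fields are what the token-usage row asks
    # the handler to return (not yet parsed by the integration).
    "output": {
        "text": "Hello!",
        "prompt_tokens": 12,
        "completion_tokens": 2,
    },
}

full_text = "".join(chunk["output"] for chunk in final_status["stream"])
print(full_text)  # -> Hello!
```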
For detailed documentation of the `ChatRunPod` class, its parameters, and its methods, refer to the source code or the generated API reference (if available).

Link to source code: https://github.com/runpod/langchain-runpod/blob/main/langchain_runpod/chat_models.py