This page covers all LangChain integrations with Oracle Cloud Infrastructure (OCI).
Installation and Setup
pip install langchain-oci oci
Authentication
Initializing ChatOCIGenAI is resource-intensive. For best performance, treat the client as a singleton and reuse the same instance across your application.
Four authentication methods are supported for OCI services, all following the standard OCI SDK authentication flow.
API Key (Default)
Uses credentials from ~/.oci/config:
from langchain_oci import ChatOCIGenAI
llm = ChatOCIGenAI(
    model_id="meta.llama-3.3-70b-instruct",
    service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com",
    compartment_id="ocid1.compartment.oc1..xxx",
    auth_type="API_KEY",  # Default
    auth_profile="DEFAULT",  # Profile name in ~/.oci/config
)
Security Token
For session-based authentication:
oci session authenticate --profile-name MY_PROFILE
llm = ChatOCIGenAI(
    model_id="meta.llama-3.3-70b-instruct",
    service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com",
    compartment_id="ocid1.compartment.oc1..your-compartment-id",
    auth_type="SECURITY_TOKEN",
    auth_profile="MY_PROFILE",
)
Instance Principal
For applications running on OCI compute instances:
llm = ChatOCIGenAI(
    model_id="meta.llama-3.3-70b-instruct",
    service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com",
    compartment_id="ocid1.compartment.oc1..your-compartment-id",
    auth_type="INSTANCE_PRINCIPAL",
)
Resource Principal
For OCI Functions and other resources:
llm = ChatOCIGenAI(
    model_id="meta.llama-3.3-70b-instruct",
    service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com",
    compartment_id="ocid1.compartment.oc1..your-compartment-id",
    auth_type="RESOURCE_PRINCIPAL",
)
OCI Generative AI
Oracle Cloud Infrastructure (OCI) Generative AI is a fully managed service that provides a set of state-of-the-art, customizable large language models (LLMs) that cover a wide range of use cases, and which are available through a single API.
Using the OCI Generative AI service you can access ready-to-use pretrained models, or create and host your own fine-tuned custom models based on your own data on dedicated AI clusters.
Chat Models
ChatOCIGenAI
Main chat model for OCI Generative AI service with full LangChain feature support.
from langchain_oci import ChatOCIGenAI
llm = ChatOCIGenAI(
    model_id="meta.llama-3.3-70b-instruct",
    service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com",
    compartment_id="ocid1.compartment.oc1..xxx",
)
Supported Features:
- ✅ Tool calling (including parallel tools with Llama 4+)
- ✅ Structured output (Pydantic, JSON mode)
- ✅ Vision & multimodal (13+ models support images)
- ✅ Streaming (sync and async)
- ✅ Async operations (ainvoke, astream, abatch)
- ✅ PDF, video, audio processing (Gemini models)
ChatOCIOpenAI
OpenAI Responses API compatibility for OCI commercial OpenAI models.
from langchain_oci import ChatOCIOpenAI
llm = ChatOCIOpenAI(
    model="openai.gpt-5.4",
    service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com",
    compartment_id="ocid1.compartment.oc1..your-compartment-id",
)
Features:
- OpenAI-compatible interface
- Conversation store support for persistent memory
- Access to GPT-4, GPT-5, o1, o3 models (where available)
Embedding Models
OCIGenAIEmbeddings
Text and image embedding models.
from langchain_oci import OCIGenAIEmbeddings
# Text embeddings
embeddings = OCIGenAIEmbeddings(
    model_id="cohere.embed-english-v3.0",
    service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com",
    compartment_id="ocid1.compartment.oc1..xxx",
)
# Image embeddings (with multimodal models)
vector = embeddings.embed_image("./architecture_diagram.png")
Available Models:
- cohere.embed-english-v3.0 (1024 dimensions)
- cohere.embed-multilingual-v3.0 (1024 dimensions)
- cohere.embed-v4.0 (text + image, 256-1536 dimensions)
Vision & Multimodal
13+ vision-capable models across Meta Llama, Google Gemini, xAI Grok, and Cohere:
from langchain.messages import HumanMessage
from langchain_oci import ChatOCIGenAI, load_image
llm = ChatOCIGenAI(
    model_id="meta.llama-3.2-90b-vision-instruct",
    service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com",
    compartment_id="ocid1.compartment.oc1..xxx",
)
message = HumanMessage(content=[
    {"type": "text", "text": "Identify all microservices, data flows, and external dependencies."},
    load_image("./architecture_diagram.png"),
])
response = llm.invoke([message])
Gemini Multimodal (PDF, video, audio):
import base64
from langchain.messages import HumanMessage
from langchain_oci import ChatOCIGenAI
llm = ChatOCIGenAI(
    model_id="google.gemini-2.5-flash",
    service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com",
    compartment_id="ocid1.compartment.oc1..your-compartment-id",
)
# PDF processing - Extract structured data from contracts
with open("vendor_contract.pdf", "rb") as f:
    pdf_data = base64.b64encode(f.read()).decode()

message = HumanMessage(content=[
    {"type": "text", "text": "Extract: parties, effective date, payment terms. Return as JSON."},
    {"type": "document_url", "document_url": {"url": f"data:application/pdf;base64,{pdf_data}"}},
])
response = llm.invoke([message])
AI Agents
Create LangGraph-powered ReAct agents with OCI models:
from langchain.tools import tool
from langchain_oci import create_oci_agent
@tool
def query_infrastructure(resource_type: str, region: str) -> dict:
    """Query OCI infrastructure status and health metrics.

    Args:
        resource_type: Type of resource (compute, database, network)
        region: OCI region to query
    """
    # Example: call the OCI monitoring API here
    return {
        "status": "healthy",
        "active_instances": 12,
        "cpu_utilization": "45%",
        "alerts": [],
    }
agent = create_oci_agent(
    model_id="meta.llama-4-scout-17b-16e-instruct",
    tools=[query_infrastructure],
    compartment_id="ocid1.compartment.oc1..xxx",
    service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com",
    system_prompt="You are an infrastructure monitoring assistant.",
)
from langchain.messages import HumanMessage
result = agent.invoke({
    "messages": [HumanMessage(content="Check compute resource health in us-ashburn-1")]
})
Provider Coverage
| Provider | Example Models | Key Features |
|---|---|---|
| Meta | Llama 3.2, 3.3, 4 (Scout, Maverick) | Vision, parallel tool calls |
| Google | Gemini 2.0/2.5 Flash, Flash Lite, Pro | PDF, video, audio processing |
| xAI | Grok 3, 4 (Fast, Mini) | Vision, reasoning modes |
| Cohere | Command R+, Command A | RAG optimization, V2 vision |
| OpenAI | GPT-5.4, GPT-5, o1, o3 | Reasoning (via ChatOCIOpenAI) |
Note: Model availability varies by region. See the OCI Generative AI documentation for the current model catalog.
OCI Data Science Model Deployments
OCI Data Science is a fully managed and serverless platform for data science teams. Deploy custom models as endpoints using the OCI Data Science Model Deployment Service.
ChatOCIModelDeployment
Chat model for OCI Data Science Model Deployments.
from langchain_oci import ChatOCIModelDeployment
llm = ChatOCIModelDeployment(
    endpoint="https://modeldeployment.us-ashburn-1.oci.customer-oci.com/ocid1.datasciencemodeldeployment.oc1.iad.xxx/predict",
    model="odsc-llm",
)
ChatOCIModelDeploymentVLLM
Optimized for vLLM-based deployments with streaming support:
from langchain_oci import ChatOCIModelDeploymentVLLM
llm = ChatOCIModelDeploymentVLLM(
    endpoint="https://modeldeployment.us-ashburn-1.oci.customer-oci.com/ocid1.datasciencemodeldeployment.oc1.iad.xxx/predict",
    model="meta-llama/Llama-2-7b-chat-hf",
    streaming=True,
)
ChatOCIModelDeploymentTGI
For Text Generation Inference (TGI) deployments:
from langchain_oci import ChatOCIModelDeploymentTGI
llm = ChatOCIModelDeploymentTGI(
    endpoint="https://modeldeployment.us-ashburn-1.oci.customer-oci.com/ocid1.datasciencemodeldeployment.oc1.iad.xxx/predict",
)
Samples
For comprehensive guides covering all features, see the langchain-oci samples:
| Sample | Level | Topics |
|---|---|---|
| 01: Getting Started | Beginner | Authentication, basic chat, providers |
| 02: Vision & Multimodal | Beginner | Image analysis, PDF, video, audio |
| 03: Building AI Agents | Intermediate | ReAct agents, tools, memory |
| 04: Tool Calling Mastery | Intermediate | Parallel tools, workflows |
| 05: Structured Output | Intermediate | Pydantic schemas, JSON modes |
| 07: Async for Production | Advanced | ainvoke, astream, FastAPI |
| 09: Provider Deep Dive | Specialized | Meta, Gemini, Cohere, xAI specifics |
| 10: Embeddings | Specialized | Text & image embeddings, RAG |
Additional Resources