This guide helps you get started with OCI Generative AI embedding models. Oracle Cloud Infrastructure (OCI) Generative AI provides embedding models for text and images that support semantic search, retrieval-augmented generation (RAG), clustering, and cross-modal applications.
For detailed documentation, see the OCI Generative AI documentation and API reference.
Overview
Integration details
| Class | Package | Serializable | Caches at params | Async | Downloads | Version |
|---|---|---|---|---|---|---|
| OCIGenAIEmbeddings | langchain-oci | beta | ✅ | ✅ |  |  |
Model features
| Text embeddings | Image embeddings | Multimodal | Batch operations | Async |
|---|---|---|---|---|
| ✅ | ✅ | ✅ | ✅ | ✅ |
Setup
pip install -qU langchain-oci oci
Set up authentication:
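The authentication steps themselves are not reproduced in this section. As a minimal sketch, assuming API-key authentication with the OCI SDK's standard configuration file (by default `~/.oci/config`, with a `[DEFAULT]` profile), the helper below reports which required keys are missing from a profile. `missing_oci_config_keys` is a hypothetical helper name, not part of `langchain-oci` or the `oci` SDK, and the sample values are placeholders.

```python
import configparser

# Keys the OCI SDK expects in an API-key profile of the config file.
REQUIRED_KEYS = ("user", "fingerprint", "tenancy", "region", "key_file")


def missing_oci_config_keys(config_text: str, profile: str = "DEFAULT") -> list[str]:
    """Return the required keys that are absent or empty in the given profile."""
    # Disable interpolation so '%' characters in values are read literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    # configparser treats DEFAULT specially: has_section() never reports it.
    if profile != "DEFAULT" and not parser.has_section(profile):
        return list(REQUIRED_KEYS)
    section = parser[profile]
    return [key for key in REQUIRED_KEYS if not section.get(key)]


# Placeholder profile that deliberately omits key_file.
sample = """
[DEFAULT]
user=ocid1.user.oc1..example
fingerprint=aa:bb:cc
tenancy=ocid1.tenancy.oc1..example
region=us-chicago-1
"""
print(missing_oci_config_keys(sample))
```

To check a real file, pass the text of the config file, for example `pathlib.Path("~/.oci/config").expanduser().read_text()`.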
Instantiation
from langchain_oci import OCIGenAIEmbeddings

embeddings = OCIGenAIEmbeddings(
    model_id="cohere.embed-english-v3.0",
    service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com",
    compartment_id="ocid1.compartment.oc1..your-compartment-id",
)
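To avoid hard-coding the compartment OCID, the constructor arguments can be assembled from environment variables. The sketch below is one possible approach: the variable names `OCI_REGION`, `OCI_COMPARTMENT_ID`, and `OCI_EMBED_MODEL_ID` are hypothetical, and the endpoint pattern is only inferred from the Chicago URL shown above, so confirm it for your region before relying on it.

```python
import os


def embedding_kwargs_from_env(env=None) -> dict:
    """Build keyword arguments for OCIGenAIEmbeddings from environment variables."""
    env = os.environ if env is None else env
    compartment_id = env.get("OCI_COMPARTMENT_ID")  # hypothetical variable name
    if not compartment_id:
        raise ValueError("OCI_COMPARTMENT_ID is not set")
    region = env.get("OCI_REGION", "us-chicago-1")  # hypothetical variable name
    return {
        "model_id": env.get("OCI_EMBED_MODEL_ID", "cohere.embed-english-v3.0"),
        # Assumed pattern, copied from the us-chicago-1 endpoint in this guide.
        "service_endpoint": f"https://inference.generativeai.{region}.oci.oraclecloud.com",
        "compartment_id": compartment_id,
    }


# Demonstrate with an explicit mapping instead of the real environment.
demo_env = {"OCI_COMPARTMENT_ID": "ocid1.compartment.oc1..your-compartment-id"}
print(embedding_kwargs_from_env(demo_env))
```

The resulting dictionary can then be unpacked into the constructor, as in `OCIGenAIEmbeddings(**embedding_kwargs_from_env())`.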
Usage
Build semantic search over technical documentation:
import numpy as np

# Index code documentation
docs = [
    "authenticate() validates JWT tokens and returns user object",
    "authorize() checks user permissions for resource access",
    "audit_log() records user actions for compliance tracking",
]
doc_vectors = embeddings.embed_documents(docs)

# Search with natural language query
query = "How do I verify user identity?"
query_vector = embeddings.embed_query(query)

# Find most relevant documentation
similarities = [
    np.dot(query_vector, doc_vec)
    / (np.linalg.norm(query_vector) * np.linalg.norm(doc_vec))
    for doc_vec in doc_vectors
]
best_match = docs[np.argmax(similarities)]
# Returns: "authenticate() validates JWT tokens..."
Use cases: Code search, documentation Q&A, log analysis, duplicate detection
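The example above keeps only the single best match. When several candidates are needed, the same cosine-similarity scoring can be sorted to return the top k. The following is a minimal sketch in plain Python; `cosine_similarity` and `top_k` are hypothetical helper names, and the toy 2-D vectors stand in for real embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vector, doc_vectors, docs, k=2):
    """Return the k (score, doc) pairs with the highest cosine similarity."""
    scored = [
        (cosine_similarity(query_vector, vec), doc)
        for vec, doc in zip(doc_vectors, docs)
    ]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]


# Toy vectors: "auth" points along the first axis, "billing" along the second.
toy_docs = ["auth", "billing", "logging"]
toy_vectors = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
for score, doc in top_k([1.0, 0.1], toy_vectors, toy_docs):
    print(f"{score:.3f}  {doc}")
```

With real data, pass `query_vector`, `doc_vectors`, and `docs` from the example above.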
Image Embeddings
Search visual assets with text queries using multimodal embeddings:
import numpy as np
from langchain_oci import OCIGenAIEmbeddings

embeddings = OCIGenAIEmbeddings(
    model_id="cohere.embed-v4.0",
    service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com",
    compartment_id="ocid1.compartment.oc1..your-compartment-id",
)

# Index architecture diagrams
diagrams = ["microservices.png", "database_schema.png", "network.png"]
image_vectors = embeddings.embed_image_batch(diagrams)

# Search with text query
query_vector = embeddings.embed_query("database relationships")

# Find best match
similarities = [
    np.dot(query_vector, v) / (np.linalg.norm(query_vector) * np.linalg.norm(v))
    for v in image_vectors
]
print(diagrams[np.argmax(similarities)])  # "database_schema.png"
Use cases: Technical diagram search, asset management, visual documentation retrieval
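For larger image collections it may be useful to send images in smaller groups rather than in one call. The sketch below wraps `embed_image_batch` in a chunking loop. The helper name `embed_images_in_chunks` and the default `chunk_size=16` are assumptions for illustration, not documented service limits, and `FakeImageEmbedder` is a stand-in so the example runs offline.

```python
def embed_images_in_chunks(embedder, image_paths, chunk_size=16):
    """Embed images chunk by chunk and return the vectors in input order."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    vectors = []
    for start in range(0, len(image_paths), chunk_size):
        chunk = image_paths[start:start + chunk_size]
        vectors.extend(embedder.embed_image_batch(chunk))
    return vectors


class FakeImageEmbedder:
    """Offline stand-in exposing the same embed_image_batch method name."""

    def __init__(self):
        self.calls = []

    def embed_image_batch(self, paths):
        self.calls.append(list(paths))
        # One-dimensional fake vector: the length of the file name.
        return [[float(len(path))] for path in paths]


fake = FakeImageEmbedder()
result = embed_images_in_chunks(fake, ["a.png", "bb.png", "ccc.png"], chunk_size=2)
print(len(fake.calls), result)
```

Passing the real `embeddings` object in place of `FakeImageEmbedder()` would issue one `embed_image_batch` call per chunk.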
Available Models
| Model | Dimensions | Type |
|---|---|---|
| cohere.embed-english-v3.0 | 1024 | Text only |
| cohere.embed-multilingual-v3.0 | 1024 | Text only |
| cohere.embed-v4.0 | 256-1536 | Text + Image |
See the OCI model catalog for all models.
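Vector indexes are usually created with a fixed dimension, so it can help to check that a returned vector matches the model in use. The sketch below encodes the dimensions listed in the table and validates a vector's length. `check_dimension` is a hypothetical helper, and treating 256-1536 as inclusive bounds is an assumption: the service may accept only specific sizes within that range.

```python
# Dimension bounds copied from the model table: (minimum, maximum).
MODEL_DIMENSIONS = {
    "cohere.embed-english-v3.0": (1024, 1024),
    "cohere.embed-multilingual-v3.0": (1024, 1024),
    "cohere.embed-v4.0": (256, 1536),
}


def check_dimension(model_id, vector):
    """Return True if the vector length falls within the model's listed bounds."""
    try:
        low, high = MODEL_DIMENSIONS[model_id]
    except KeyError:
        raise ValueError(f"unknown model: {model_id}") from None
    return low <= len(vector) <= high


print(check_dimension("cohere.embed-english-v3.0", [0.0] * 1024))
```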
RAG Example
The langchain-community package is no longer maintained. Examples that import from langchain_community may be outdated or broken. Use with caution.
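from langchain_community.vectorstores import FAISS

# `documents` is assumed to be a list of LangChain Document objects you have
# already loaded; `embeddings` is the OCIGenAIEmbeddings instance created above.
# Create vector store
vectorstore = FAISS.from_documents(documents, embeddings)

# Search
results = vectorstore.similarity_search("your query", k=3)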
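Retrieval is only the first half of RAG: the retrieved passages still have to be placed in a prompt for a language model. The sketch below shows one simple way to do that in plain Python. `build_rag_prompt` and the prompt wording are illustrative assumptions, not part of `langchain-oci`.

```python
def build_rag_prompt(question: str, passages: list[str]) -> str:
    """Combine retrieved passages and a question into a single prompt string."""
    if passages:
        # Number each passage so an answer can refer back to its source.
        context = "\n".join(
            f"[{i}] {text}" for i, text in enumerate(passages, start=1)
        )
    else:
        context = "(no relevant passages found)"
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


prompt = build_rag_prompt(
    "How do I verify user identity?",
    ["authenticate() validates JWT tokens and returns user object"],
)
print(prompt)
```

With the vector store above, the passages would be `[doc.page_content for doc in results]`, since `similarity_search` returns LangChain `Document` objects.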
Async
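Use the async methods to embed without blocking the event loop:
# Run inside an async function, or in a notebook that supports top-level await
query_vector = await embeddings.aembed_query("What is AI?")
doc_vectors = await embeddings.aembed_documents(["Doc 1", "Doc 2"])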
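The async methods pay off when several requests run concurrently. The sketch below embeds multiple queries with `asyncio.gather` and caps the number of in-flight calls with a semaphore. `embed_queries_concurrently`, the default `max_concurrency=4`, and `FakeAsyncEmbedder` are assumptions for illustration; the fake class only mimics the `aembed_query` method name so the example runs offline.

```python
import asyncio


async def embed_queries_concurrently(embedder, queries, max_concurrency=4):
    """Embed queries concurrently, limiting in-flight requests with a semaphore."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def embed_one(query):
        async with semaphore:
            return await embedder.aembed_query(query)

    # gather() returns results in the same order as the input queries.
    return await asyncio.gather(*(embed_one(q) for q in queries))


class FakeAsyncEmbedder:
    """Offline stand-in exposing the same aembed_query method name."""

    async def aembed_query(self, text):
        await asyncio.sleep(0)  # yield control, as a real network call would
        return [float(len(text))]


vectors = asyncio.run(
    embed_queries_concurrently(FakeAsyncEmbedder(), ["What is AI?", "What is RAG?"])
)
print(vectors)
```

A lower `max_concurrency` is gentler on service rate limits; a suitable value depends on your tenancy and is not specified here.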
API Reference
For detailed documentation of all OCIGenAIEmbeddings features and configurations, head to the API reference.