This page will help you get started with Google Generative AI embedding models using LangChain. For detailed documentation on GoogleGenerativeAIEmbeddings features and configuration options, please refer to the API reference.

Overview

gemini-embedding-2-preview natively supports text, image, video, audio, and PDF inputs via the Google GenAI SDK’s embed_content() API. However, the LangChain Embeddings interface (embed_query / embed_documents) currently only accepts text inputs. Multimodal embedding support in LangChain is planned for a future release. For multimodal use cases today, use the Google GenAI SDK directly.
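For reference, here is a minimal sketch of calling the SDK directly, assuming the google-genai package is installed and GOOGLE_API_KEY is set in the environment (multimodal inputs would be passed through the same embed_content() call):
from google import genai

client = genai.Client()  # reads GOOGLE_API_KEY from the environment

response = client.models.embed_content(
    model="gemini-embedding-2-preview",
    contents="hello, world!",
)
print(response.embeddings[0].values[:5])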

Setup

To access Google Gemini embedding models, you'll need to create a Google Cloud project, enable the Generative Language API, get an API key, and install the langchain-google-genai integration package.

Credentials

Head to Google AI Studio to sign up and generate an API key. See the Gemini API keys documentation for more details. Once you've done this, set the GOOGLE_API_KEY environment variable:
import getpass
import os

if not os.getenv("GOOGLE_API_KEY"):
    os.environ["GOOGLE_API_KEY"] = getpass.getpass("Enter your Google API key: ")
To enable automated tracing of your model calls, set your LangSmith API key:
os.environ["LANGSMITH_TRACING"] = "true"
os.environ["LANGSMITH_API_KEY"] = getpass.getpass("Enter your LangSmith API key: ")

Installation

The LangChain Google Generative AI integration lives in the langchain-google-genai package:
pip install -qU langchain-google-genai

Instantiation

Now we can instantiate our model object and generate embeddings:
from langchain_google_genai import GoogleGenerativeAIEmbeddings

embeddings = GoogleGenerativeAIEmbeddings(model="gemini-embedding-2-preview")
vector = embeddings.embed_query("hello, world!")
vector[:5]
[-0.024917153641581535,
 0.012005362659692764,
 -0.003886754624545574,
 -0.05774897709488869,
 0.0020742062479257584]

Reduced dimensionality

gemini-embedding-2-preview supports flexible output dimensions via Matryoshka Representation Learning (MRL). You can reduce dimensionality to optimize storage and latency:
embeddings = GoogleGenerativeAIEmbeddings(
    model="gemini-embedding-2-preview",
    output_dimensionality=768,  # Suggested: 768, 1536, or 3072 (default)
)
vector = embeddings.embed_query("hello, world!")
len(vector)
768
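Note that truncating MRL embeddings to a smaller dimensionality may leave the vectors no longer unit-length, so if you compute similarity scores yourself, consider re-normalizing them first. A minimal sketch, assuming numpy is installed:
import numpy as np

vec = np.array(embeddings.embed_query("hello, world!"))
vec = vec / np.linalg.norm(vec)  # L2-normalize before manual cosine math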

Batch

You can also embed multiple strings at once for a processing speedup:
vectors = embeddings.embed_documents(
    [
        "Today is Monday",
        "Today is Tuesday",
        "Today is April Fools day",
    ]
)
len(vectors), len(vectors[0])
(3, 768)

Indexing and retrieval

Embedding models are often used in retrieval-augmented generation (RAG) flows, both for indexing data and for retrieving it later. For more detailed instructions, please see our RAG tutorials. Below, see how to index and retrieve data using the embeddings object we initialized above. In this example, we will index and retrieve a sample document in the InMemoryVectorStore.
# Create a vector store with a sample text
from langchain_core.vectorstores import InMemoryVectorStore

text = "LangChain is the framework for building context-aware reasoning applications"

vectorstore = InMemoryVectorStore.from_texts(
    [text],
    embedding=embeddings,
)

# Use the vectorstore as a retriever
retriever = vectorstore.as_retriever()

# Retrieve the most similar text
retrieved_documents = retriever.invoke("What is LangChain?")

# show the retrieved document's content
retrieved_documents[0].page_content
'LangChain is the framework for building context-aware reasoning applications'

Task type

GoogleGenerativeAIEmbeddings optionally supports a task_type, which currently must be one of:
  • SEMANTIC_SIMILARITY: Used to generate embeddings that are optimized to assess text similarity.
  • CLASSIFICATION: Used to generate embeddings that are optimized to classify texts according to preset labels.
  • CLUSTERING: Used to generate embeddings that are optimized to cluster texts based on their similarities.
  • RETRIEVAL_DOCUMENT, RETRIEVAL_QUERY, QUESTION_ANSWERING, and FACT_VERIFICATION: Used to generate embeddings that are optimized for document search or information retrieval.
  • CODE_RETRIEVAL_QUERY: Used to retrieve a code block based on a natural language query, such as "sort an array" or "reverse a linked list". Embeddings of the code blocks are computed using RETRIEVAL_DOCUMENT.
By default, we use RETRIEVAL_DOCUMENT in the embed_documents method and RETRIEVAL_QUERY in the embed_query method. If you provide a task type, we will use that for all methods.
pip install -qU scikit-learn
from langchain_google_genai import GoogleGenerativeAIEmbeddings
from sklearn.metrics.pairwise import cosine_similarity

query_embeddings = GoogleGenerativeAIEmbeddings(
    model="gemini-embedding-2-preview", task_type="RETRIEVAL_QUERY"
)
doc_embeddings = GoogleGenerativeAIEmbeddings(
    model="gemini-embedding-2-preview", task_type="RETRIEVAL_DOCUMENT"
)

q_embed = query_embeddings.embed_query("What is the capital of France?")
d_embed = doc_embeddings.embed_documents(
    ["The capital of France is Paris.", "Philipp likes to eat pizza."]
)

for i, d in enumerate(d_embed):
    print(f"Document {i + 1}:")
    print(f"Cosine similarity with query: {cosine_similarity([q_embed], [d])[0][0]}")
    print("---")
Document 1:
Cosine similarity with query: 0.7892893360164779
---
Document 2:
Cosine similarity with query: 0.5438283285204146
---

Additional configuration

You can pass the following parameters to GoogleGenerativeAIEmbeddings to customize the SDK’s behavior:
  • base_url: Custom base URL for the API client (e.g., a custom endpoint)
  • output_dimensionality: Reduce the dimensionality of returned embeddings (e.g., output_dimensionality=256)
  • request_options: Request options dict (e.g., {"timeout": 10})
  • additional_headers: Additional HTTP headers to include in API requests
  • client_args: Additional arguments to pass to the underlying HTTP client
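For example, a hypothetical configuration combining several of these options (the timeout value and header name here are illustrative, not prescriptive):
embeddings = GoogleGenerativeAIEmbeddings(
    model="gemini-embedding-2-preview",
    output_dimensionality=256,
    request_options={"timeout": 10},  # illustrative timeout value
    additional_headers={"X-Example-Header": "value"},  # hypothetical header
)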

API reference

For detailed documentation on GoogleGenerativeAIEmbeddings features and configuration options, please refer to the API reference.