Access Google’s Generative AI models, including the Gemini family, directly via the Gemini API or experiment rapidly using Google AI Studio. This is often the best starting point for individual developers.
For information on the latest models, model IDs, their features, context windows, and more, head to the Google AI docs.
API Reference: For detailed documentation of all features and configuration options, head to the ChatGoogleGenerativeAI API reference.
Setup
To access Google AI models you’ll need to create a Google Account, get a Google AI API key, and install the langchain-google-genai integration package.
Installation
pip install -U langchain-google-genai
Credentials
Head to Google AI Studio to generate a Google AI API key. Once you’ve done this, set the GOOGLE_API_KEY environment variable:
import getpass
import os
if "GOOGLE_API_KEY" not in os.environ:
os.environ["GOOGLE_API_KEY"] = getpass.getpass("Enter your Google AI API key: ")
To enable automated tracing of your model calls, set your LangSmith API key:
os.environ["LANGSMITH_API_KEY"] = getpass.getpass("Enter your LangSmith API key: ")
os.environ["LANGSMITH_TRACING"] = "true"
Instantiation
Now we can instantiate our model object and generate responses:
from langchain_google_genai import ChatGoogleGenerativeAI
llm = ChatGoogleGenerativeAI(
model="gemini-2.5-flash",
temperature=0,
max_tokens=None,
timeout=None,
max_retries=2,
# other params...
)
See the ChatGoogleGenerativeAI API Reference for the full set of available model parameters.
Invocation
messages = [
(
"system",
"You are a helpful assistant that translates English to French. Translate the user sentence.",
),
("human", "I love programming."),
]
ai_msg = llm.invoke(messages)
ai_msg
AIMessage(content="J'adore la programmation.", additional_kwargs={}, response_metadata={'prompt_feedback': {'block_reason': 0, 'safety_ratings': []}, 'finish_reason': 'STOP', 'model_name': 'gemini-2.5-flash', 'safety_ratings': []}, id='run-3b28d4b8-8a62-4e6c-ad4e-b53e6e825749-0', usage_metadata={'input_tokens': 20, 'output_tokens': 7, 'total_tokens': 27, 'input_token_details': {'cache_read': 0}})
print(ai_msg.content)
J'adore la programmation.
Gemini 3 series models will always return a list of content blocks to capture thought signatures. Use the .text property to recover string content.
from langchain_google_genai import ChatGoogleGenerativeAI
llm = ChatGoogleGenerativeAI(model="gemini-3-pro-preview")
response = llm.invoke("Hello")
response.content # [{"type": "text", "text": "Hello!", "extras": {"signature": "EpQFCp...lKx64r"}}]
response.text # "Hello!"
Multimodal usage
Gemini models can accept multimodal inputs (text, images, audio, video) and, for some models, generate multimodal outputs.
Provide image inputs along with text using a HumanMessage with a list content format.
Make sure to use a model that supports image input, such as gemini-2.5-flash.
import base64
from langchain.messages import HumanMessage
from langchain_google_genai import ChatGoogleGenerativeAI
# Example using a public image URL
message_url = HumanMessage(
content=[
{
"type": "text",
"text": "Describe the image at the URL.",
},
{"type": "image_url", "image_url": "https://picsum.photos/seed/picsum/200/300"},
]
)
result_url = llm.invoke([message_url])
print(f"Response for URL image: {result_url.content}")
# Example using a local image file encoded in base64
image_file_path = "path/to/your/image.png"  # Provide a path to a local image file
with open(image_file_path, "rb") as image_file:
encoded_image = base64.b64encode(image_file.read()).decode("utf-8")
message_local = HumanMessage(
content=[
{"type": "text", "text": "Describe the local image."},
{"type": "image_url", "image_url": f"data:image/png;base64,{encoded_image}"},
]
)
result_local = llm.invoke([message_local])
print(f"Response for local image: {result_local.content}")
Other supported image_url formats (see the sketch below):
- A Google Cloud Storage URI (gs://...). Ensure the service account has access.
- A PIL Image object (the library handles encoding).
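As a sketch of these alternatives (the bucket, object, and file names below are placeholders), either value can be passed as the image_url in the same message structure:
from PIL import Image
from langchain.messages import HumanMessage
# Google Cloud Storage URI (hypothetical bucket/object; the service account needs read access)
message_gcs = HumanMessage(
    content=[
        {"type": "text", "text": "Describe the image stored in Cloud Storage."},
        {"type": "image_url", "image_url": "gs://my-bucket/images/example.png"},
    ]
)
# PIL Image object; the library handles encoding
pil_image = Image.open("path/to/your/image.png")
message_pil = HumanMessage(
    content=[
        {"type": "text", "text": "Describe this image."},
        {"type": "image_url", "image_url": pil_image},
    ]
)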
Provide audio file inputs along with text.
import base64
from langchain.messages import HumanMessage
# Ensure you have an audio file named 'example_audio.mp3' or provide the correct path.
audio_file_path = "example_audio.mp3"
audio_mime_type = "audio/mpeg"
with open(audio_file_path, "rb") as audio_file:
encoded_audio = base64.b64encode(audio_file.read()).decode("utf-8")
message = HumanMessage(
content=[
{"type": "text", "text": "Transcribe the audio."},
{
"type": "media",
"data": encoded_audio, # Use base64 string directly
"mime_type": audio_mime_type,
},
]
)
response = llm.invoke([message])
print(f"Response for audio: {response.content}")
Provide video file inputs along with text.
import base64
from langchain.messages import HumanMessage
from langchain_google_genai import ChatGoogleGenerativeAI
# Ensure you have a video file named 'example_video.mp4' or provide the correct path.
video_file_path = "example_video.mp4"
video_mime_type = "video/mp4"
with open(video_file_path, "rb") as video_file:
encoded_video = base64.b64encode(video_file.read()).decode("utf-8")
message = HumanMessage(
content=[
{"type": "text", "text": "Describe the first few frames of the video."},
{
"type": "media",
"data": encoded_video, # Use base64 string directly
"mime_type": video_mime_type,
},
]
)
response = llm.invoke([message])
print(f"Response for video: {response.content}")
Image generation
Certain models (such as gemini-2.5-flash-image) can generate text and images inline.
See more information on the Gemini API docs for details.
# Running in a Jupyter notebook environment
import base64
from IPython.display import Image, display
from langchain.messages import AIMessage
from langchain_google_genai import ChatGoogleGenerativeAI, Modality
llm = ChatGoogleGenerativeAI(model="models/gemini-2.5-flash-image")
message = {
"role": "user",
"content": "Generate a photorealistic image of a cuddly cat wearing a hat.",
}
response = llm.invoke(
[message],
response_modalities=[Modality.TEXT, Modality.IMAGE],
)
def _get_image_base64(response: AIMessage) -> str:
image_block = next(
block
for block in response.content
if isinstance(block, dict) and block.get("image_url")
)
return image_block["image_url"].get("url").split(",")[-1]
image_base64 = _get_image_base64(response)
display(Image(data=base64.b64decode(image_base64), width=300))
Audio generation
Certain models (such as gemini-2.5-flash-preview-tts) can generate audio files.
See more information on the Gemini API docs for details.
from langchain_google_genai import ChatGoogleGenerativeAI
llm = ChatGoogleGenerativeAI(model="gemini-2.5-flash-preview-tts")
response = llm.invoke(
"Please say The quick brown fox jumps over the lazy dog",
generation_config=dict(response_modalities=["AUDIO"]),
)
# Base64 encoded binary data of the audio
wav_data = response.additional_kwargs.get("audio")
with open("output.wav", "wb") as f:
f.write(wav_data)
Tool calling
You can equip the model with tools to call.
from langchain.tools import tool
from langchain.messages import HumanMessage, ToolMessage
from langchain_google_genai import ChatGoogleGenerativeAI
# Define the tool
@tool(description="Get the current weather in a given location")
def get_weather(location: str) -> str:
return "It's sunny."
# Initialize and bind (potentially multiple) tools to the model
model_with_tools = ChatGoogleGenerativeAI(model="gemini-2.5-flash-lite").bind_tools([get_weather])
# Step 1: Model generates tool calls
messages = [HumanMessage("What's the weather in Boston?")]
ai_msg = model_with_tools.invoke(messages)
messages.append(ai_msg)
# Check the tool calls in the response
print(ai_msg.tool_calls)
# Step 2: Execute tools and collect results
for tool_call in ai_msg.tool_calls:
# Execute the tool with the generated arguments
tool_result = get_weather.invoke(tool_call)
messages.append(tool_result)
# Step 3: Pass results back to model for final response
final_response = model_with_tools.invoke(messages)
[{'name': 'get_weather', 'args': {'location': 'Boston'}, 'id': 'fb91e46d-e3f7-445b-a62f-50ae024bcdac', 'type': 'tool_call'}]
AIMessage(content='The weather in Boston is sunny.', additional_kwargs={}, response_metadata={'prompt_feedback': {'block_reason': 0, 'safety_ratings': []}, 'finish_reason': 'STOP', 'model_name': 'gemini-2.5-flash-lite', 'safety_ratings': [], 'model_provider': 'google_genai'}, id='lc_run--3fb38729-285b-4b43-aa3e-499cbc910544-0', usage_metadata={'input_tokens': 83, 'output_tokens': 7, 'total_tokens': 90, 'input_token_details': {'cache_read': 0}})
Structured output
Force the model to respond with a specific structure. See the Gemini API docs for more info.
from langchain_google_genai import ChatGoogleGenerativeAI
from pydantic import BaseModel
from typing import Literal
class Feedback(BaseModel):
sentiment: Literal["positive", "neutral", "negative"]
summary: str
llm = ChatGoogleGenerativeAI(model="gemini-2.5-pro")
structured_llm = llm.with_structured_output(
schema=Feedback.model_json_schema(), method="json_schema"
)
response = structured_llm.invoke("The new UI is great!")
response["sentiment"] # "positive"
response["summary"] # "The user expresses positive..."
For streaming structured output, merge dictionaries instead of using +=:
stream = structured_llm.stream("The interface is intuitive and beautiful!")
full = next(stream)
for chunk in stream:
full.update(chunk) # Merge dictionaries
print(full) # Complete structured response
Structured output methods
Two methods are supported for structured output:
- method="function_calling" (default): Uses tool calling to extract structured data.
- method="json_mode": Uses Gemini’s native structured output.
The json_mode method is recommended for better reliability, as it constrains the model’s generation process directly rather than relying on post-processing of tool calls.
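For comparison, here is a minimal sketch of the default function-calling method using the Feedback model defined above; when a Pydantic class is passed as the schema, the result is a Feedback instance rather than a dict:
structured_llm_fc = llm.with_structured_output(Feedback, method="function_calling")
feedback = structured_llm_fc.invoke("The new UI is great!")
feedback.sentiment  # "positive"
feedback.summary    # "The user expresses positive..."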
Token usage tracking
Access token usage information from the response metadata.
from langchain_google_genai import ChatGoogleGenerativeAI
llm = ChatGoogleGenerativeAI(model="gemini-2.5-flash-lite")
result = llm.invoke("Explain the concept of prompt engineering in one sentence.")
print(result.content)
print("\nUsage Metadata:")
print(result.usage_metadata)
Prompt engineering is the art and science of crafting effective text prompts to elicit desired and accurate responses from large language models.
Usage Metadata:
{'input_tokens': 10, 'output_tokens': 24, 'total_tokens': 34, 'input_token_details': {'cache_read': 0}}
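usage_metadata is reported per response, so if you need totals across several calls, one simple approach (a sketch, not a built-in feature) is to sum the dictionaries yourself:
totals = {"input_tokens": 0, "output_tokens": 0, "total_tokens": 0}
for prompt in ["Define tokenization in one sentence.", "Define embeddings in one sentence."]:
    result = llm.invoke(prompt)
    usage = result.usage_metadata or {}
    for key in totals:
        totals[key] += usage.get(key, 0)
print(totals)  # cumulative token counts across both calls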
Built-in tools
Google Gemini supports a variety of built-in tools, which can be bound to the model in the usual way.
Google search
See Gemini docs for detail.
from langchain_google_genai import ChatGoogleGenerativeAI
model = ChatGoogleGenerativeAI(model="gemini-2.5-flash-lite")
model_with_search = model.bind_tools([{"google_search": {}}])
response = model_with_search.invoke("When is the next total solar eclipse in US?")
response.content_blocks
[{'type': 'text',
'text': 'The next total solar eclipse visible in the contiguous United States will occur on...',
'annotations': [{'type': 'citation',
'id': 'abc123',
'url': '<url for source 1>',
'title': '<source 1 title>',
'start_index': 0,
'end_index': 99,
'cited_text': 'The next total solar eclipse...',
'extras': {'google_ai_metadata': {'web_search_queries': ['next total solar eclipse in US'],
'grounding_chunk_index': 0,
'confidence_scores': []}}},
{'type': 'citation',
'id': 'abc234',
'url': '<url for source 2>',
'title': '<source 2 title>',
'start_index': 0,
'end_index': 99,
'cited_text': 'The next total solar eclipse...',
'extras': {'google_ai_metadata': {'web_search_queries': ['next total solar eclipse in US'],
'grounding_chunk_index': 1,
'confidence_scores': []}}}]}]
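The cited sources can be pulled back out of the annotations. A minimal sketch that relies only on the block shape shown above:
for block in response.content_blocks:
    if block.get("type") == "text":
        print(block["text"])
        for annotation in block.get("annotations", []):
            if annotation.get("type") == "citation":
                # Print the grounding source behind each cited span
                print("-", annotation.get("title"), annotation.get("url"))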
Code execution
See Gemini docs for detail.
from langchain_google_genai import ChatGoogleGenerativeAI
model = ChatGoogleGenerativeAI(model="gemini-2.5-flash-lite")
model_with_code_interpreter = model.bind_tools([{"code_execution": {}}])
response = model_with_code_interpreter.invoke("Use Python to calculate 3^3.")
response.content_blocks
[{'type': 'server_tool_call',
'name': 'code_interpreter',
'args': {'code': 'print(3**3)', 'language': <Language.PYTHON: 1>},
'id': '...'},
{'type': 'server_tool_result',
'tool_call_id': '',
'status': 'success',
'output': '27\n',
'extras': {'block_type': 'code_execution_result',
'outcome': <Outcome.OUTCOME_OK: 1>}},
{'type': 'text', 'text': 'The calculation of 3 to the power of 3 is 27.'}]
Thinking support
See the Gemini API docs for more info.
from langchain_google_genai import ChatGoogleGenerativeAI
llm = ChatGoogleGenerativeAI(
model="models/gemini-2.5-flash",
thinking_budget=1024,
include_thoughts=True,
)
response = llm.invoke("How many O's are in Google? How did you verify your answer?")
reasoning_tokens = response.usage_metadata["output_token_details"]["reasoning"]
print("Response:", response.content)
print("Reasoning tokens used:", reasoning_tokens)
Thought signatures
Thought signatures are encrypted representations of the model’s reasoning processes. Gemini 2.5 and 3 series models may return thought signatures in their responses.
Gemini 3 may raise 4xx errors if thought signatures are not passed back with tool call responses. Upgrade to langchain-google-genai >= 3.1.0 to ensure this is handled correctly.
from langchain_google_genai import ChatGoogleGenerativeAI
llm = ChatGoogleGenerativeAI(
model="models/gemini-3-pro-preview",
thinking_budget=1024,
include_thoughts=True,
)
response = llm.invoke("How many O's are in Google? How did you verify your answer?")
response.content_blocks[-1]
# {"type": "text", "text": "...", "extras": {"signature": "EtgVCt...mc0w=="}}
Safety settings
Gemini models have default safety settings that can be overridden. If you are receiving lots of “Safety Warnings” from your models, you can try tweaking the safety_settings attribute of the model. For example, to turn off safety blocking for dangerous content, you can construct your LLM as follows:
from langchain_google_genai import (
ChatGoogleGenerativeAI,
HarmBlockThreshold,
HarmCategory,
)
llm = ChatGoogleGenerativeAI(
model="gemini-2.5-pro",
safety_settings={
HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT: HarmBlockThreshold.BLOCK_NONE,
},
)
For an enumeration of the categories and thresholds available, see Google’s safety setting types.
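A sketch that combines several categories; the enum members below are the standard harm categories and thresholds, but consult the reference above for the authoritative list:
from langchain_google_genai import (
    ChatGoogleGenerativeAI,
    HarmBlockThreshold,
    HarmCategory,
)
llm = ChatGoogleGenerativeAI(
    model="gemini-2.5-pro",
    safety_settings={
        # Block only high-probability harassment content
        HarmCategory.HARM_CATEGORY_HARASSMENT: HarmBlockThreshold.BLOCK_ONLY_HIGH,
        # Block medium-and-above hate speech
        HarmCategory.HARM_CATEGORY_HATE_SPEECH: HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE,
        # Disable blocking for dangerous content
        HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT: HarmBlockThreshold.BLOCK_NONE,
    },
)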
API reference
For detailed documentation of all features and configuration options, head to the ChatGoogleGenerativeAI API reference.