langchain-google-genai
The langchain-google-genai package provides the LangChain integration for Google's Gemini models. This is often the best starting point for individual developers.
For information on the latest models, their features, context windows, etc., head to the Google AI docs. All model IDs can be found in the Gemini API docs.
Integration details
Class | Package | Local | Serializable | JS support
---|---|---|---|---
ChatGoogleGenerativeAI | langchain-google-genai | ❌ | beta | ✅
Model features
Tool calling | Structured output | JSON mode | Image input | Audio input | Video input | Token-level streaming | Native async | Token usage | Logprobs |
---|---|---|---|---|---|---|---|---|---|
✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ |
Setup
To access Google AI models you'll need to create a Google Account, get a Google AI API key, and install the langchain-google-genai integration package.
1. Installation:
Chat Models
Use the ChatGoogleGenerativeAI class to interact with Google's chat models. See the API reference for full details.
Instantiation
Now we can instantiate our model object and generate chat completions:
Invocation
Chaining
We can chain our model with a prompt template.
Multimodal Usage
Gemini models can accept multimodal inputs (text, images, audio, video) and, for some models, generate multimodal outputs.
Image Input
Provide image inputs along with text using a HumanMessage with a list content format. Make sure to use a model that supports image input, such as gemini-2.5-flash.
The following image_url formats are supported:
- A Google Cloud Storage URI (gs://...). Ensure the service account has access.
- A PIL Image object (the library handles encoding).
Audio Input
Provide audio file inputs along with text.
Video Input
Provide video file inputs along with text.
Image Generation (Multimodal Output)
Certain models (such as gemini-2.0-flash-preview-image-generation) can generate text and images inline. You need to specify the desired response_modalities. See the Gemini API docs for details.
Tool Calling
You can equip the model with tools to call.
Structured Output
Force the model to respond with a specific structure using Pydantic models.
Structured Output Methods
Two methods are supported for structured output:
- method="function_calling" (default): Uses tool calling to extract structured data. Compatible with all Gemini models.
- method="json_schema" or method="json_mode": Uses Gemini's native structured output with responseSchema. More reliable, but requires Gemini 1.5+ models. (json_mode is kept for backwards compatibility.)
The json_schema method is recommended for better reliability, as it constrains the model's generation process directly rather than relying on post-processing tool calls.
Token Usage Tracking
Access token usage information from the response metadata.
Built-in tools
Google Gemini supports a variety of built-in tools (Google Search, code execution), which can be bound to the model in the usual way.
Native Async
Use asynchronous methods for non-blocking calls.
Safety Settings
Gemini models have default safety settings that can be overridden. If you are receiving lots of "Safety Warnings" from your models, you can try tweaking the safety_settings attribute of the model. For example, to turn off safety blocking for dangerous content, you can construct your LLM as follows: