Google Vertex AI Search (formerly known as Enterprise Search on Generative AI App Builder) is a part of the Vertex AI machine learning platform offered by Google Cloud.

Vertex AI Search lets organizations quickly build generative AI-powered search engines for customers and employees. It’s underpinned by a variety of Google Search technologies, including semantic search, which helps deliver more relevant results than traditional keyword-based search techniques by using natural language processing and machine learning techniques to infer relationships within the content and intent from the user’s query input. Vertex AI Search also benefits from Google’s expertise in understanding how users search and factors in content relevance to order displayed results.

Vertex AI Search is available in the Google Cloud Console and via an API for enterprise workflow integration.

This notebook demonstrates how to configure Vertex AI Search and use the Vertex AI Search retriever. The Vertex AI Search retriever encapsulates the Python client library and uses it to access the Search Service API.

For detailed documentation of all VertexAISearchRetriever features and configurations, head to the API reference.

Install the langchain-google-community and google-cloud-discoveryengine packages to use the Vertex AI Search retriever.
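
Before going further, it can help to confirm that both packages are present in the current environment. The following is a minimal sketch using only the standard library; the helper name `missing_packages` is hypothetical and not part of either package.

```python
from importlib import metadata


def missing_packages(names):
    """Return the distribution names from `names` that are not installed."""
    missing = []
    for name in names:
        try:
            # metadata.version raises PackageNotFoundError for unknown distributions
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


# The two distributions named in the text above.
REQUIRED = ["langchain-google-community", "google-cloud-discoveryengine"]
print("Missing packages:", missing_packages(REQUIRED))
```

An empty list means both distributions are installed; any name printed still needs to be installed with your package manager.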

Populate the data store with documents from the gs://cloud-samples-data/gen-app-builder/search/alphabet-investor-pdfs Cloud Storage folder.
Make sure to use the Cloud Storage (without metadata) option.

If running in Google Colab, authenticate with google.colab.google.auth; otherwise, follow one of the supported methods to make sure that your Application Default Credentials are properly set.
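
The two authentication paths can be sketched as follows. This is an illustration under assumptions, not the notebook's own cell: the helper names are hypothetical, the Colab check relies on the `google.colab` module already being loaded in Colab kernels, and the non-Colab branch assumes the `google-auth` package is installed.

```python
import sys


def running_in_colab() -> bool:
    """Return True when the google.colab module is already loaded."""
    return "google.colab" in sys.modules


def ensure_credentials():
    """Authenticate in Colab, otherwise resolve Application Default Credentials.

    Returns the project ID detected by ADC, or None in Colab.
    """
    if running_in_colab():
        # This module only exists inside a Colab runtime.
        from google.colab import auth

        auth.authenticate_user()
        return None

    # Assumption: google-auth is installed. google.auth.default() raises
    # DefaultCredentialsError when no credentials can be found.
    import google.auth

    _credentials, project = google.auth.default()
    return project
```

Calling `ensure_credentials()` once at the start of a session is enough; the client library picks up the resulting credentials on its own.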

The retriever is implemented in the langchain_google_community.VertexAISearchRetriever class. The get_relevant_documents method returns a list of langchain.schema.Document documents where the page_content field of each document is populated with the document content.
Depending on the data type used in Vertex AI Search (website, structured or unstructured), the page_content field is populated as follows:
Website with advanced indexing: an extractive answer that matches a query. The metadata field is populated with metadata (if any) of the document from which the segments or answers were extracted.
Unstructured data source: either an extractive segment or an extractive answer that matches a query. The metadata field is populated with metadata (if any) of the document from which the segments or answers were extracted.
Structured data source: a JSON string containing all the fields returned from the structured data source. The metadata field is populated with metadata (if any) of the document.

The retriever is configured with the following parameters:

project_id - Your Google Cloud Project ID.
location_id - The location of the data store.
  global (default)
  us
  eu
search_engine_id - The ID of the search app you want to use. (Required for Blended Search)
data_store_id - The ID of the data store you want to use.

The project_id, search_engine_id and data_store_id parameters can be provided explicitly in the retriever’s constructor or through the environment variables PROJECT_ID, SEARCH_ENGINE_ID and DATA_STORE_ID.
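
The two ways of supplying these parameters can be sketched as below. The helper `retriever_kwargs` is hypothetical and only illustrates the idea that explicit arguments take precedence over environment variables; the retriever resolves the environment variables itself, so this helper is not required in practice. `build_retriever` assumes langchain-google-community is installed.

```python
import os

# Constructor parameter -> environment variable, as listed in the text above.
ENV_VARS = {
    "project_id": "PROJECT_ID",
    "search_engine_id": "SEARCH_ENGINE_ID",
    "data_store_id": "DATA_STORE_ID",
}


def retriever_kwargs(env=None, **explicit):
    """Merge environment values with explicit arguments (explicit wins)."""
    env = os.environ if env is None else env
    kwargs = {"location_id": "global"}  # documented default location
    for param, var in ENV_VARS.items():
        if env.get(var):
            kwargs[param] = env[var]
    kwargs.update(explicit)
    return kwargs


def build_retriever(**explicit):
    """Create the retriever; needs langchain-google-community and credentials."""
    from langchain_google_community import VertexAISearchRetriever

    return VertexAISearchRetriever(**retriever_kwargs(**explicit))
```

For example, `build_retriever(data_store_id="my-data-store")` would take the project from `PROJECT_ID` and the data store from the explicit argument, where `"my-data-store"` is a placeholder value.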
You can also configure a number of optional parameters, including:
max_documents - The maximum number of documents used to provide extractive segments or extractive answers.
get_extractive_answers - By default, the retriever is configured to return extractive segments.
  Set this field to True to return extractive answers. This is used only when engine_data_type is set to 0 (unstructured).
max_extractive_answer_count - The maximum number of extractive answers returned in each search result.
  This is used only when engine_data_type is set to 0 (unstructured).
max_extractive_segment_count - The maximum number of extractive segments returned in each search result.
  This is used only when engine_data_type is set to 0 (unstructured).
filter - The filter expression for the search results based on the metadata associated with the documents in the data store.
query_expansion_condition - Specification to determine under which conditions query expansion should occur.
  0 - Unspecified query expansion condition. In this case, server behavior defaults to disabled.
  1 - Disabled query expansion. Only the exact search query is used, even if SearchResponse.total_size is zero.
  2 - Automatic query expansion built by the Search API.
engine_data_type - Defines the Vertex AI Search data type.
  0 - Unstructured data
  1 - Structured data
  2 - Website data
  3 - Blended search
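
Because engine_data_type and query_expansion_condition are plain integers, named constants can make a configuration easier to read. The sketch below is an illustration only: the enum classes, the `optional_settings` helper and its default values are hypothetical and are not part of the library; only the parameter names and integer codes come from the list above.

```python
from enum import IntEnum


class EngineDataType(IntEnum):
    """Integer codes for engine_data_type, as listed above."""
    UNSTRUCTURED = 0
    STRUCTURED = 1
    WEBSITE = 2
    BLENDED = 3


class QueryExpansion(IntEnum):
    """Integer codes for query_expansion_condition, as listed above."""
    UNSPECIFIED = 0
    DISABLED = 1
    AUTO = 2


def optional_settings(
    max_documents=3,
    extractive_answers=False,
    engine_data_type=EngineDataType.UNSTRUCTURED,
    query_expansion=QueryExpansion.AUTO,
):
    """Build keyword arguments for the retriever's optional parameters."""
    settings = {
        "max_documents": max_documents,
        "engine_data_type": int(engine_data_type),
        "query_expansion_condition": int(query_expansion),
    }
    # get_extractive_answers is only used for unstructured data stores.
    if engine_data_type == EngineDataType.UNSTRUCTURED:
        settings["get_extractive_answers"] = extractive_answers
    return settings
```

The resulting dictionary can be unpacked into the retriever's constructor alongside the project and data store identifiers.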

In previous versions, this retriever was called GoogleCloudEnterpriseSearchRetriever.
To update to the new retriever, make the following changes:
Change the import statement: from langchain.retrievers import GoogleCloudEnterpriseSearchRetriever -> from langchain_google_community import VertexAISearchRetriever.
Change all class references: GoogleCloudEnterpriseSearchRetriever -> VertexAISearchRetriever.

Use .invoke to issue a single query. Because retrievers are Runnables, we can use any method in the Runnable interface, such as .batch, as well.
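
A minimal sketch of both calls follows. The helper names `summarize` and `run_queries` are hypothetical; the code only assumes that the retriever exposes .invoke and .batch and that each returned document has page_content and metadata fields, as described earlier.

```python
def summarize(docs, width=80):
    """Return (snippet, metadata) pairs for a list of Document-like objects."""
    return [(doc.page_content[:width], doc.metadata) for doc in docs]


def run_queries(retriever, queries):
    """Run one query with .invoke, or several with .batch.

    Returns one list of (snippet, metadata) pairs per query.
    """
    if len(queries) == 1:
        # .invoke takes a single query string and returns a list of documents.
        return [summarize(retriever.invoke(queries[0]))]
    # .batch takes a list of queries and returns one document list per query.
    return [summarize(docs) for docs in retriever.batch(queries)]
```

With a configured retriever, `run_queries(retriever, ["first question", "second question"])` returns the snippets for each query in order; the query strings here are placeholders.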