Google Cloud BigQuery Vector Search lets you use GoogleSQL to do semantic search, using vector indexes for fast approximate results, or using brute force for exact results.

This tutorial illustrates how to work with an end-to-end data and embedding management system in LangChain, and provides scalable semantic search in BigQuery using the `BigQueryVectorStore` class. This class is part of a set of two classes that provide unified data storage and flexible vector search in Google Cloud:

- `BigQueryVectorStore`: ideal for rapid prototyping with no infrastructure setup, and for batch retrieval.
- `VertexFSVectorStore`: enables low-latency retrieval with manual or scheduled data sync; a good fit for production-ready, user-facing GenAI applications.

Set your project ID. If you don't know it, try running `gcloud config list` or `gcloud projects list`. You can also change the `REGION` variable used by BigQuery. Learn more about BigQuery regions.

Enable the Vertex AI API by running `gcloud services enable aiplatform.googleapis.com --project {PROJECT_ID}` (replace `{PROJECT_ID}` with the name of your project).
You can use any LangChain embeddings model.
You can use the `batch_search` method for scalable vector similarity search.
You can add documents together with pre-computed embeddings using the `add_texts_with_embeddings` method. This is particularly useful for multimodal data, which might require custom preprocessing before the embedding generation.
Call `.to_vertex_fs_vector_store()` to get a `VertexFSVectorStore` object, which offers low latency for online use cases. All mandatory parameters are automatically transferred from the existing `BigQueryVectorStore` class. See the class definition for all the other parameters you can use.
Moving back to `BigQueryVectorStore` is equally easy with the `.to_bq_vector_store()` method.