Couchbase Query Vector Store

`CouchbaseQueryVectorStore` is the preferred implementation of vector search in Couchbase. It runs vector similarity search through the Query service (SQL++) and the Index service instead of the Search service, so vector operations are written as ordinary SQL++ queries that call vector functions. For more about Couchbase’s vector search capabilities, see the official documentation: Choose the Right Vector Index.

This functionality is only available in Couchbase 8.0 and above.

Key differences from `CouchbaseSearchVectorStore` (formerly `CouchbaseVectorStore`)

- Query and Index Service: Uses Couchbase’s Query service with SQL++ instead of the Search service
- No Index Required: Does not require a pre-configured search index for basic operations
- SQL++ Syntax: Supports WHERE clauses and SQL++ query syntax for filtering
- Vector Functions: Uses the APPROX_VECTOR_DISTANCE function for similarity calculations
- Distance Strategies: Supports multiple distance strategies (Dot Product, Cosine, Euclidean, Euclidean Squared)
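
To make the list above more concrete, here is a minimal sketch of the kind of SQL++ statement a Query-based similarity search comes down to: a SELECT ordered by APPROX_VECTOR_DISTANCE, with an optional WHERE clause. The helper `buildVectorQuery`, its option names, and the statement layout are assumptions made for illustration. The statement the store actually sends may differ.

```typescript
// Hypothetical helper for illustration: NOT part of @langchain/community.
type Metric = "DOT" | "COSINE" | "EUCLIDEAN" | "EUCLIDEAN_SQUARED";

interface VectorQueryOptions {
  keyspace: string; // already-quoted keyspace, e.g. "`bucket`.`scope`.`collection`"
  embeddingKey: string; // field that stores the vector
  queryVector: number[]; // embedding of the search text
  metric: Metric; // distance strategy passed to APPROX_VECTOR_DISTANCE
  k: number; // number of nearest documents to return
  where?: string; // optional SQL++ filter
}

function buildVectorQuery(o: VectorQueryOptions): string {
  // The query vector is written as a JSON array literal.
  const distance =
    "APPROX_VECTOR_DISTANCE(d.`" + o.embeddingKey + "`, " +
    JSON.stringify(o.queryVector) + ', "' + o.metric + '")';
  const filter = o.where ? ` WHERE ${o.where}` : "";
  // Ordering by the distance expression returns the closest documents first.
  return `SELECT d.*, ${distance} AS distance FROM ${o.keyspace} AS d${filter} ORDER BY ${distance} LIMIT ${o.k}`;
}
```

Calling it with `where: "metadata.category = 'database'"` and `k: 4` yields a single statement that filters and ranks in one pass, which is the main practical difference from the Search-based store.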

Installation

npm

npm install couchbase @langchain/openai @langchain/community @langchain/core

Create Couchbase connection object

Create a connection to the Couchbase cluster, then pass the cluster object to the vector store. This example connects with a username and password; any other supported authentication method also works. For details on connecting to a Couchbase cluster, see the Node SDK documentation.

import { Cluster } from "couchbase";

const connectionString = "couchbase://localhost";
const dbUsername = "Administrator"; // database user with read and write access to the bucket (adding documents requires write access)
const dbPassword = "Password"; // password for the database user

const couchbaseClient = await Cluster.connect(connectionString, {
  username: dbUsername,
  password: dbPassword,
  configProfile: "wanDevelopment",
});
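
Hard-coded credentials are fine for a local test, but they are usually read from the environment instead. The sketch below shows one way to do that. The helper `loadCouchbaseSettings` is hypothetical, and the variable names follow the ones used in the full example later on this page. Unlike that example, it fails fast when the username or password is missing rather than falling back to defaults.

```typescript
interface CouchbaseSettings {
  connectionString: string;
  username: string;
  password: string;
}

// Hypothetical helper: reads connection settings from an env-like object.
function loadCouchbaseSettings(
  env: Record<string, string | undefined>
): CouchbaseSettings {
  // Throw a descriptive error instead of connecting with empty credentials.
  const required = (name: string): string => {
    const value = env[name];
    if (!value) {
      throw new Error(`Missing environment variable ${name}`);
    }
    return value;
  };

  return {
    // Only the connection string gets a local-development default.
    connectionString: env["COUCHBASE_DB_CONN_STR"] ?? "couchbase://localhost",
    username: required("COUCHBASE_DB_USERNAME"),
    password: required("COUCHBASE_DB_PASSWORD"),
  };
}
```

In a Node.js program you would call `loadCouchbaseSettings(process.env)` and pass the resulting fields to `Cluster.connect`.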

Basic setup

import { CouchbaseQueryVectorStore, DistanceStrategy } from "@langchain/community/vectorstores/couchbase_query";
import { OpenAIEmbeddings } from "@langchain/openai";
import { Cluster } from "couchbase";

// Connect to Couchbase
const cluster = await Cluster.connect("couchbase://localhost", {
  username: "Administrator",
  password: "password",
});

// Initialize embeddings
const embeddings = new OpenAIEmbeddings();

// Configure the vector store
const vectorStore = await CouchbaseQueryVectorStore.initialize(embeddings, {
  cluster,
  bucketName: "my-bucket",
  scopeName: "my-scope",
  collectionName: "my-collection",
  textKey: "text", // optional, defaults to "text"
  embeddingKey: "embedding", // optional, defaults to "embedding"
  distanceStrategy: DistanceStrategy.COSINE, // optional, defaults to DOT
});

Creating vector indexes

The Query vector store supports creating vector indexes to improve search performance. There are two types of indexes available:

Hyperscale index

A specialized vector index optimized for vector operations using Couchbase’s vector indexing capabilities:

import { IndexType } from "@langchain/community/vectorstores/couchbase_query";

await vectorStore.createIndex({
  indexType: IndexType.HYPERSCALE,
  indexDescription: "IVF,SQ8",
  indexName: "my_vector_index", // optional
  vectorDimension: 1536, // optional, auto-detected from embeddings
  distanceMetric: DistanceStrategy.COSINE, // optional, uses store default
  fields: ["text", "metadata"], // optional, defaults to text field
  whereClause: "type = 'document'", // optional filter
  indexScanNprobes: 10, // optional tuning parameter
  indexTrainlist: 1000, // optional tuning parameter
});

Generated SQL++:

CREATE VECTOR INDEX `my_vector_index` ON `bucket`.`scope`.`collection`
(`embedding` VECTOR) INCLUDE (`text`, `metadata`)
WHERE type = 'document' USING GSI WITH {'dimension': 1536, 'similarity': 'cosine', 'description': 'IVF,SQ8'}

Composite index

A general-purpose GSI index that includes vector fields alongside scalar fields:

await vectorStore.createIndex({
  indexType: IndexType.COMPOSITE,
  indexDescription: "IVF1024,SQ8",
  indexName: "my_composite_index",
  vectorDimension: 1536,
  fields: ["text", "metadata.category"],
  whereClause: "created_date > '2023-01-01'",
  indexScanNprobes: 3,
  indexTrainlist: 10000,
});

Generated SQL++:

CREATE INDEX `my_composite_index` ON `bucket`.`scope`.`collection`
(`text`, `metadata.category`, `embedding` VECTOR)
WHERE created_date > '2023-01-01' USING GSI
WITH {'dimension': 1536, 'similarity': 'dot', 'description': 'IVF1024,SQ8', 'scan_nprobes': 3, 'trainlist': 10000}

Key differences

| Aspect | Hyperscale Index | Composite Index |
| --- | --- | --- |
| SQL++ Syntax | `CREATE VECTOR INDEX` | `CREATE INDEX` |
| Vector Field | `(field VECTOR)` with `INCLUDE` clause | `(field1, field2, vector_field VECTOR)` |
| Vector Parameters | Supports all vector parameters | Supports all vector parameters |
| Optimization | Specialized for vector operations | General-purpose GSI with vector support |
| Use Case | Pure vector similarity search | Mixed vector and scalar queries |
| Performance | Optimized for vector distance calculations | Good for hybrid queries |
| Tuning Parameters | Supports `indexScanNprobes`, `indexTrainlist` | Supports `indexScanNprobes`, `indexTrainlist` |
| Limitations | Only one vector field, uses `INCLUDE` for other fields | One vector field among multiple index keys |
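
The structural difference between the two statements can be sketched as a small rendering function. This is an illustration only, written to mirror the two generated statements shown above. The names `renderCreateIndex` and `IndexSketch` are hypothetical, and this is not the code the library uses.

```typescript
// Illustrative sketch only: not the code used by @langchain/community.
type IndexKind = "hyperscale" | "composite";

interface IndexSketch {
  kind: IndexKind;
  indexName: string;
  keyspace: string; // already-quoted `bucket`.`scope`.`collection`
  embeddingKey: string;
  fields: string[]; // scalar fields, quoted as-is like the statements above
  whereClause?: string;
  withParams: Record<string, string | number>;
}

const quote = (name: string): string => "`" + name + "`";

function renderCreateIndex(o: IndexSketch): string {
  const vectorKey = `${quote(o.embeddingKey)} VECTOR`;
  const scalarKeys = o.fields.map(quote);
  let head: string;
  if (o.kind === "hyperscale") {
    // Hyperscale: the vector is the only index key; other fields go in INCLUDE.
    const include =
      scalarKeys.length > 0 ? ` INCLUDE (${scalarKeys.join(", ")})` : "";
    head = `CREATE VECTOR INDEX ${quote(o.indexName)} ON ${o.keyspace} (${vectorKey})${include}`;
  } else {
    // Composite: scalar keys come first and the vector is one key among them.
    const keys = [...scalarKeys, vectorKey].join(", ");
    head = `CREATE INDEX ${quote(o.indexName)} ON ${o.keyspace} (${keys})`;
  }
  const where = o.whereClause ? ` WHERE ${o.whereClause}` : "";
  // Numbers are written bare, strings in single quotes.
  const params = Object.entries(o.withParams)
    .map(([key, value]) =>
      typeof value === "number" ? `'${key}': ${value}` : `'${key}': '${value}'`
    )
    .join(", ");
  return `${head}${where} USING GSI WITH {${params}}`;
}
```

Passing the Hyperscale options from the earlier example (`fields: ["text", "metadata"]`, `whereClause: "type = 'document'"`, and the `dimension`, `similarity`, and `description` parameters) reproduces the first generated statement on a single line.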

Basic vector search example

The following example connects to a cluster, adds two documents to the vector store, and runs a similarity search both with and without scores.

import { OpenAIEmbeddings } from "@langchain/openai";
import {
  CouchbaseQueryVectorStore,
  DistanceStrategy,
} from "@langchain/community/vectorstores/couchbase_query";
import { Cluster } from "couchbase";
import { Document } from "@langchain/core/documents";

const connectionString = process.env.COUCHBASE_DB_CONN_STR ?? "couchbase://localhost";
const databaseUsername = process.env.COUCHBASE_DB_USERNAME ?? "Administrator";
const databasePassword = process.env.COUCHBASE_DB_PASSWORD ?? "Password";

const couchbaseClient = await Cluster.connect(connectionString, {
  username: databaseUsername,
  password: databasePassword,
  configProfile: "wanDevelopment",
});

// OpenAI API Key is required to use OpenAIEmbeddings
const embeddings = new OpenAIEmbeddings({
  apiKey: process.env.OPENAI_API_KEY,
});

const vectorStore = await CouchbaseQueryVectorStore.initialize(embeddings, {
  cluster: couchbaseClient,
  bucketName: "testing",
  scopeName: "_default",
  collectionName: "_default",
  textKey: "text",
  embeddingKey: "embedding",
  distanceStrategy: DistanceStrategy.COSINE,
});

// Add documents
const documents = [
  new Document({
    pageContent: "Couchbase is a NoSQL database",
    metadata: { category: "database", type: "document" }
  }),
  new Document({
    pageContent: "Vector search enables semantic similarity",
    metadata: { category: "ai", type: "document" }
  })
];

await vectorStore.addDocuments(documents);

// Perform similarity search
const query = "What is a NoSQL database?";
const results = await vectorStore.similaritySearch(query, 4);
console.log("Search results:", results[0]);

// Search with scores
const resultsWithScores = await vectorStore.similaritySearchWithScore(query, 4);
console.log("Document:", resultsWithScores[0][0]);
console.log("Score:", resultsWithScores[0][1]);

Searching documents

Basic similarity search

// Basic similarity search
const results = await vectorStore.similaritySearch(
  "What is a NoSQL database?",
  4
);

Search with filters

// Search with filters
const filteredResults = await vectorStore.similaritySearch(
  "database technology",
  4,
  {
    where: "metadata.category = 'database'",
    fields: ["text", "metadata.category"]
  }
);

Search with scores

// Search with scores
const resultsWithScores = await vectorStore.similaritySearchWithScore(
  "vector search capabilities",
  4
);

Complex filtering

const results = await vectorStore.similaritySearch(
  "search query",
  10,
  {
    where: "metadata.category IN ['tech', 'science'] AND metadata.rating >= 4",
    fields: ["content", "metadata.title", "metadata.rating"]
  }
);
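
Because `where` is a plain SQL++ string, filters built from runtime values need careful quoting. The sketch below composes the filter from the example above using small helpers. The helpers `inList`, `atLeast`, and `allOf` are hypothetical, and the approach assumes that SQL++ accepts JSON-style double-quoted string literals, which is worth confirming against your server version. Field names are inserted verbatim, so they should come from trusted code, not from user input.

```typescript
// Hypothetical helpers for composing the `where` string.

// JSON.stringify writes the values as a JSON array and escapes quotes inside strings.
const inList = (field: string, values: Array<string | number>): string =>
  `${field} IN ${JSON.stringify(values)}`;

const atLeast = (field: string, min: number): string => `${field} >= ${min}`;

// Join individual conditions with AND.
const allOf = (...clauses: string[]): string => clauses.join(" AND ");

const where = allOf(
  inList("metadata.category", ["tech", "science"]),
  atLeast("metadata.rating", 4)
);
// where is: metadata.category IN ["tech","science"] AND metadata.rating >= 4
```

The resulting string can be passed as the `where` option of `similaritySearch` in place of a hand-written filter.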

Configuration options

Distance strategies

- DistanceStrategy.DOT - Dot Product (default)
- DistanceStrategy.COSINE - Cosine Similarity
- DistanceStrategy.EUCLIDEAN - Euclidean Distance (also known as L2)
- DistanceStrategy.EUCLIDEAN_SQUARED - Euclidean Squared Distance (also known as L2 Squared)
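
For reference, the four measures behind these strategies can be written out as plain functions. These are the textbook formulas, given as a sketch to show how the strategies differ. How Couchbase turns them into the score it reports (for example, sign conventions or converting a similarity into a distance) is not covered here.

```typescript
// Textbook definitions, for illustration only.
function dot(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same dimension");
  }
  return a.reduce((sum, value, i) => sum + value * b[i], 0);
}

// Cosine similarity: dot product of the vectors divided by the product of their lengths.
function cosineSimilarity(a: number[], b: number[]): number {
  const norms = Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b));
  if (norms === 0) {
    throw new Error("Cosine similarity is undefined for a zero vector");
  }
  return dot(a, b) / norms;
}

// Squared Euclidean (L2 squared) distance: sum of squared component differences.
function euclideanSquared(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same dimension");
  }
  return a.reduce((sum, value, i) => sum + (value - b[i]) ** 2, 0);
}

// Euclidean (L2) distance: square root of the squared distance.
function euclidean(a: number[], b: number[]): number {
  return Math.sqrt(euclideanSquared(a, b));
}

// Example: dot([1, 2, 3], [4, 5, 6]) is 32, euclideanSquared of the same pair is 27.
```

Euclidean and Euclidean Squared rank results identically, since the square root preserves order. The squared form simply skips the square root.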

Index types

- IndexType.HYPERSCALE - Specialized vector index for optimal vector search performance
- IndexType.COMPOSITE - General-purpose index that can include vector and scalar fields

Advanced usage

Custom vector fields

const vectorStore = await CouchbaseQueryVectorStore.initialize(embeddings, {
  cluster,
  bucketName: "my-bucket",
  scopeName: "my-scope",
  collectionName: "my-collection",
  textKey: "content",
  embeddingKey: "vector_embedding",
  distanceStrategy: DistanceStrategy.EUCLIDEAN,
});

Creating from texts

const texts = [
  "Couchbase is a NoSQL database",
  "Vector search enables semantic similarity"
];

const metadatas = [
  { category: "database" },
  { category: "ai" }
];

const vectorStore = await CouchbaseQueryVectorStore.fromTexts(
  texts,
  metadatas,
  embeddings,
  {
    cluster,
    bucketName: "my-bucket",
    scopeName: "my-scope",
    collectionName: "my-collection"
  }
);

Deleting documents

const documentIds = ["doc1", "doc2", "doc3"];
await vectorStore.delete({ ids: documentIds });

Performance considerations

- Create Indexes: Use `createIndex()` to create appropriate vector indexes for better performance
- Choose Index Type:
  - Use Hyperscale indexes for pure vector search workloads where you primarily perform similarity searches
  - Use Composite indexes for mixed queries that combine vector similarity with scalar field filtering
- Tune Parameters: Adjust `indexScanNprobes` and `indexTrainlist` based on your data size and performance requirements
- Filter Early: Use WHERE clauses to reduce the search space before vector calculations

Error handling

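import { IndexType } from "@langchain/community/vectorstores/couchbase_query";

try {
  await vectorStore.createIndex({
    indexType: IndexType.HYPERSCALE,
    indexDescription: "IVF,SQ8",
  });
} catch (error) {
  // In TypeScript a caught value is `unknown`, so narrow it before reading `.message`
  const message = error instanceof Error ? error.message : String(error);
  console.error("Index creation failed:", message);
}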

Common errors

Insufficient training data

If you see errors related to insufficient training data, try the following:

- Increase the `indexTrainlist` parameter (recommended: about 50 vectors per centroid)
- Make sure the collection holds enough documents with vector embeddings
- For collections with fewer than 1 million vectors, use number_of_vectors / 1000 centroids
- For larger collections, use sqrt(number_of_vectors) centroids
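
The sizing rules above can be turned into a small calculation. The sketch below is hypothetical: the helper `suggestIvfSizing` is not part of the library, rounding to whole centroids is a choice made here, and the `SQ8` quantization in the description string is simply carried over from the examples on this page.

```typescript
interface IvfSizing {
  centroids: number; // number of IVF centroids
  trainlist: number; // suggested value for indexTrainlist
  description: string; // suggested value for indexDescription
}

// Hypothetical helper applying the rules of thumb listed above.
function suggestIvfSizing(vectorCount: number): IvfSizing {
  if (!Number.isInteger(vectorCount) || vectorCount <= 0) {
    throw new Error("vectorCount must be a positive integer");
  }
  // Fewer than 1 million vectors: n / 1000 centroids; otherwise sqrt(n).
  const raw = vectorCount < 1000000 ? vectorCount / 1000 : Math.sqrt(vectorCount);
  const centroids = Math.max(1, Math.round(raw));
  // About 50 training vectors per centroid, capped at the vectors available.
  const trainlist = Math.min(vectorCount, centroids * 50);
  return { centroids, trainlist, description: `IVF${centroids},SQ8` };
}

// Example: 100,000 vectors gives 100 centroids, a trainlist of 5000, and "IVF100,SQ8".
```

The `description` and `trainlist` values map to the `indexDescription` and `indexTrainlist` options of `createIndex()`. Treat them as starting points to tune, not fixed answers.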

Comparison with `CouchbaseSearchVectorStore`

| Feature | `CouchbaseQueryVectorStore` | `CouchbaseSearchVectorStore` |
| --- | --- | --- |
| Service | Query (SQL++) | Search (FTS) |
| Index Required | Optional (for performance) | Required |
| Query Language | SQL++ WHERE clauses | Search query syntax |
| Vector Functions | APPROX_VECTOR_DISTANCE | VectorQuery API |
| Setup Complexity | Lower | Higher |
| Performance | Good with indexes | Optimized for search |

Frequently asked questions

Do I need to create an index before using `CouchbaseQueryVectorStore`?

No, unlike the Search-based CouchbaseSearchVectorStore, the Query-based implementation can work without pre-created indexes. However, creating appropriate vector indexes (Hyperscale or Composite) will significantly improve query performance.

When should I use hyperscale vs. composite indexes?

- Use Hyperscale indexes when you primarily perform vector similarity searches with minimal filtering on other fields
- Use Composite indexes when you frequently combine vector similarity with filtering on scalar fields in the same query

To learn more, see Choose the Right Vector Index in the Couchbase documentation.

Can I use both `CouchbaseQueryVectorStore` and `CouchbaseSearchVectorStore` on the same data?

Yes, both can work on the same document structure. However, they use different services (Search vs Query) and have different indexing requirements.

Related

Vector store conceptual guide
Vector store how-to guides
