CouchbaseQueryVectorStore is the preferred implementation of vector search in Couchbase. It uses the Query service (SQL++) and the Index service for vector similarity search instead of the Search service, so vector operations are written as SQL++ queries with vector functions, which is both more powerful and more straightforward. For more on Couchbase's vector search capabilities, see the official documentation: Choose the Right Vector Index.
This functionality is only available in Couchbase 8.0 and above.

Key differences from CouchbaseSearchVectorStore (formerly CouchbaseVectorStore)

  • Query and Index Service: Uses Couchbase’s Query service with SQL++ instead of the Search service
  • No Index Required: Does not require a pre-configured search index for basic operations
  • SQL++ Syntax: Supports WHERE clauses and SQL++ query syntax for filtering
  • Vector Functions: Uses APPROX_VECTOR_DISTANCE function for similarity calculations
  • Distance Strategies: Supports multiple distance strategies (Dot Product, Cosine, Euclidean, Euclidean Squared)

Installation

npm
npm install couchbase @langchain/openai @langchain/community @langchain/core

Create Couchbase connection object

We create a connection to the Couchbase cluster and pass the cluster object to the vector store. The example below connects with a username and password, but any other supported connection method works as well. For details on connecting to a Couchbase cluster, see the Node SDK documentation.
import { Cluster } from "couchbase";

const connectionString = "couchbase://localhost";
const dbUsername = "Administrator"; // valid database user with read and write access to the bucket
const dbPassword = "Password"; // password for the database user

const couchbaseClient = await Cluster.connect(connectionString, {
  username: dbUsername,
  password: dbPassword,
  configProfile: "wanDevelopment",
});

Basic setup

import { CouchbaseQueryVectorStore, DistanceStrategy } from "@langchain/community/vectorstores/couchbase_query";
import { OpenAIEmbeddings } from "@langchain/openai";
import { Cluster } from "couchbase";

// Connect to Couchbase
const cluster = await Cluster.connect("couchbase://localhost", {
  username: "Administrator",
  password: "password",
});

// Initialize embeddings
const embeddings = new OpenAIEmbeddings();

// Configure the vector store
const vectorStore = await CouchbaseQueryVectorStore.initialize(embeddings, {
  cluster,
  bucketName: "my-bucket",
  scopeName: "my-scope",
  collectionName: "my-collection",
  textKey: "text", // optional, defaults to "text"
  embeddingKey: "embedding", // optional, defaults to "embedding"
  distanceStrategy: DistanceStrategy.COSINE, // optional, defaults to DOT
});

Creating vector indexes

The Query vector store supports creating vector indexes to improve search performance. There are two types of indexes available:

Hyperscale index

A specialized vector index optimized for vector operations using Couchbase’s vector indexing capabilities:
import { IndexType } from "@langchain/community/vectorstores/couchbase_query";

await vectorStore.createIndex({
  indexType: IndexType.HYPERSCALE,
  indexDescription: "IVF,SQ8",
  indexName: "my_vector_index", // optional
  vectorDimension: 1536, // optional, auto-detected from embeddings
  distanceMetric: DistanceStrategy.COSINE, // optional, uses store default
  fields: ["text", "metadata"], // optional, defaults to text field
  whereClause: "type = 'document'", // optional filter
  indexScanNprobes: 10, // optional tuning parameter
  indexTrainlist: 1000, // optional tuning parameter
});
Generated SQL++:
CREATE VECTOR INDEX `my_vector_index` ON `bucket`.`scope`.`collection`
(`embedding` VECTOR) INCLUDE (`text`, `metadata`)
WHERE type = 'document' USING GSI WITH {'dimension': 1536, 'similarity': 'cosine', 'description': 'IVF,SQ8'}

Composite index

A general-purpose GSI index that includes vector fields alongside scalar fields:
await vectorStore.createIndex({
  indexType: IndexType.COMPOSITE,
  indexDescription: "IVF1024,SQ8",
  indexName: "my_composite_index",
  vectorDimension: 1536,
  fields: ["text", "metadata.category"],
  whereClause: "created_date > '2023-01-01'",
  indexScanNprobes: 3,
  indexTrainlist: 10000,
});
Generated SQL++:
CREATE INDEX `my_composite_index` ON `bucket`.`scope`.`collection`
(`text`, `metadata.category`, `embedding` VECTOR)
WHERE created_date > '2023-01-01' USING GSI
WITH {'dimension': 1536, 'similarity': 'dot', 'description': 'IVF1024,SQ8', 'scan_nprobes': 3, 'trainlist': 10000}
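
The way the createIndex() options map onto the two statement shapes above can be sketched as a small string builder. This is only an illustration under stated assumptions, not the library's code: buildIndexStatement, IndexSketchOptions, and the quote helper are hypothetical names, and the output is a single-line statement modeled on the generated SQL++ shown above.

```typescript
// Hypothetical option shape mirroring the createIndex() arguments shown above.
type IndexKind = "hyperscale" | "composite";

interface IndexSketchOptions {
  kind: IndexKind;
  keyspace: string; // already-quoted keyspace, e.g. `bucket`.`scope`.`collection`
  indexName: string;
  embeddingKey: string;
  fields: string[];
  dimension: number;
  similarity: string;
  description: string;
  whereClause?: string;
  scanNprobes?: number;
  trainlist?: number;
}

// Wrap an identifier in backticks, as in the generated statements above.
const quote = (name: string): string => "`" + name + "`";

function buildIndexStatement(o: IndexSketchOptions): string {
  const withParts = [
    `'dimension': ${o.dimension}`,
    `'similarity': '${o.similarity}'`,
    `'description': '${o.description}'`,
  ];
  // Tuning parameters are only emitted when the caller supplies them.
  if (o.scanNprobes !== undefined) withParts.push(`'scan_nprobes': ${o.scanNprobes}`);
  if (o.trainlist !== undefined) withParts.push(`'trainlist': ${o.trainlist}`);

  const scalarKeys = o.fields.map(quote);
  const vectorKey = `${quote(o.embeddingKey)} VECTOR`;
  const prefix = `INDEX ${quote(o.indexName)} ON ${o.keyspace}`;

  // Hyperscale: the vector is the only key, other fields go into INCLUDE.
  // Composite: scalar fields and the vector are all index keys.
  const head =
    o.kind === "hyperscale"
      ? `CREATE VECTOR ${prefix} (${vectorKey})` +
        (scalarKeys.length > 0 ? ` INCLUDE (${scalarKeys.join(", ")})` : "")
      : `CREATE ${prefix} (${[...scalarKeys, vectorKey].join(", ")})`;

  const where = o.whereClause ? ` WHERE ${o.whereClause}` : "";
  return `${head}${where} USING GSI WITH {${withParts.join(", ")}}`;
}

const compositeStatement = buildIndexStatement({
  kind: "composite",
  keyspace: "`bucket`.`scope`.`collection`",
  indexName: "my_composite_index",
  embeddingKey: "embedding",
  fields: ["text", "metadata.category"],
  dimension: 1536,
  similarity: "dot",
  description: "IVF1024,SQ8",
  whereClause: "created_date > '2023-01-01'",
  scanNprobes: 3,
  trainlist: 10000,
});
console.log(compositeStatement);
```

Keeping the vector key last in the composite case and alone in the hyperscale case is the only structural difference between the two branches; the WITH clause is built the same way for both.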

Key differences

| Aspect | Hyperscale Index | Composite Index |
| --- | --- | --- |
| SQL++ Syntax | CREATE VECTOR INDEX | CREATE INDEX |
| Vector Field | (field VECTOR) with INCLUDE clause | (field1, field2, vector_field VECTOR) |
| Vector Parameters | Supports all vector parameters | Supports all vector parameters |
| Optimization | Specialized for vector operations | General-purpose GSI with vector support |
| Use Case | Pure vector similarity search | Mixed vector and scalar queries |
| Performance | Optimized for vector distance calculations | Good for hybrid queries |
| Tuning Parameters | Supports indexScanNprobes, indexTrainlist | Supports indexScanNprobes, indexTrainlist |
| Limitations | Only one vector field, uses INCLUDE for other fields | One vector field among multiple index keys |

Basic vector search example

The following example shows how to add documents to a CouchbaseQueryVectorStore and run similarity searches against them.
import { OpenAIEmbeddings } from "@langchain/openai";
import {
  CouchbaseQueryVectorStore,
  DistanceStrategy,
} from "@langchain/community/vectorstores/couchbase_query";
import { Cluster } from "couchbase";
import { Document } from "@langchain/core/documents";

const connectionString = process.env.COUCHBASE_DB_CONN_STR ?? "couchbase://localhost";
const databaseUsername = process.env.COUCHBASE_DB_USERNAME ?? "Administrator";
const databasePassword = process.env.COUCHBASE_DB_PASSWORD ?? "Password";

const couchbaseClient = await Cluster.connect(connectionString, {
  username: databaseUsername,
  password: databasePassword,
  configProfile: "wanDevelopment",
});

// OpenAI API Key is required to use OpenAIEmbeddings
const embeddings = new OpenAIEmbeddings({
  apiKey: process.env.OPENAI_API_KEY,
});

const vectorStore = await CouchbaseQueryVectorStore.initialize(embeddings, {
  cluster: couchbaseClient,
  bucketName: "testing",
  scopeName: "_default",
  collectionName: "_default",
  textKey: "text",
  embeddingKey: "embedding",
  distanceStrategy: DistanceStrategy.COSINE,
});

// Add documents
const documents = [
  new Document({
    pageContent: "Couchbase is a NoSQL database",
    metadata: { category: "database", type: "document" }
  }),
  new Document({
    pageContent: "Vector search enables semantic similarity",
    metadata: { category: "ai", type: "document" }
  })
];

await vectorStore.addDocuments(documents);

// Perform similarity search
const query = "What is a NoSQL database?";
const results = await vectorStore.similaritySearch(query, 4);
console.log("Search results:", results[0]);

// Search with scores
const resultsWithScores = await vectorStore.similaritySearchWithScore(query, 4);
console.log("Document:", resultsWithScores[0][0]);
console.log("Score:", resultsWithScores[0][1]);

Searching documents

// Basic similarity search
const results = await vectorStore.similaritySearch(
  "What is a NoSQL database?",
  4
);

Search with filters

// Search with filters
const filteredResults = await vectorStore.similaritySearch(
  "database technology",
  4,
  {
    where: "metadata.category = 'database'",
    fields: ["text", "metadata.category"]
  }
);

Search with scores

// Search with scores
const resultsWithScores = await vectorStore.similaritySearchWithScore(
  "vector search capabilities",
  4
);

Complex filtering

const results = await vectorStore.similaritySearch(
  "search query",
  10,
  {
    where: "metadata.category IN ['tech', 'science'] AND metadata.rating >= 4",
    fields: ["content", "metadata.title", "metadata.rating"]
  }
);
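
When the filter values come from user input, it can help to assemble the where string from validated parts rather than by ad hoc concatenation. The sketch below is one possible approach, not part of the library: inList, atLeast, and checkField are hypothetical helpers, and the allow-list patterns are deliberately strict assumptions that reject anything other than simple identifiers and plain values.

```typescript
// Hypothetical helpers for assembling the `where` string; not a library API.
// Values and field paths are checked against strict allow-lists before use.
const SAFE_VALUE = /^[A-Za-z0-9_ -]+$/;
const SAFE_FIELD = /^[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)*$/;

function checkField(field: string): string {
  if (!SAFE_FIELD.test(field)) {
    throw new Error(`Unsafe field path: ${field}`);
  }
  return field;
}

// Builds `field IN ['a', 'b']` from a non-empty list of plain string values.
function inList(field: string, values: string[]): string {
  if (values.length === 0) {
    throw new Error("IN list must not be empty");
  }
  const literals = values.map((value) => {
    if (!SAFE_VALUE.test(value)) {
      throw new Error(`Unsafe filter value: ${value}`);
    }
    return `'${value}'`;
  });
  return `${checkField(field)} IN [${literals.join(", ")}]`;
}

// Builds `field >= min` for a finite numeric lower bound.
function atLeast(field: string, min: number): string {
  if (!Number.isFinite(min)) {
    throw new Error("min must be a finite number");
  }
  return `${checkField(field)} >= ${min}`;
}

const whereClause = [
  inList("metadata.category", ["tech", "science"]),
  atLeast("metadata.rating", 4),
].join(" AND ");

console.log(whereClause);
// metadata.category IN ['tech', 'science'] AND metadata.rating >= 4
```

The resulting string can be passed as the where option shown in the examples above. Rejecting unexpected characters outright is more restrictive than escaping them, but it avoids depending on quoting rules.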

Configuration options

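Distance strategies

  • Dot Product - DistanceStrategy.DOT (default)
  • Cosine - DistanceStrategy.COSINE
  • Euclidean - DistanceStrategy.EUCLIDEAN
  • Euclidean Squared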
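
For intuition about what the four supported distance strategies (Dot Product, Cosine, Euclidean, Euclidean Squared) measure, here is a plain TypeScript sketch of the underlying formulas. These are the textbook definitions, not the Couchbase or LangChain implementation, and the function names are illustrative only.

```typescript
// Textbook formulas for the four measures; not the Couchbase implementation.
type Vector = readonly number[];

function assertSameLength(a: Vector, b: Vector): void {
  if (a.length !== b.length) {
    throw new Error(`Dimension mismatch: ${a.length} vs ${b.length}`);
  }
}

// Dot product: sum of element-wise products.
function dot(a: Vector, b: Vector): number {
  assertSameLength(a, b);
  return a.reduce((sum, ai, i) => sum + ai * b[i], 0);
}

// Cosine similarity: dot product divided by the product of the vector lengths.
function cosineSimilarity(a: Vector, b: Vector): number {
  return dot(a, b) / (Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b)));
}

// Squared Euclidean (L2) distance: sum of squared element-wise differences.
function euclideanSquared(a: Vector, b: Vector): number {
  assertSameLength(a, b);
  return a.reduce((sum, ai, i) => sum + (ai - b[i]) ** 2, 0);
}

// Euclidean (L2) distance: square root of the squared distance.
function euclidean(a: Vector, b: Vector): number {
  return Math.sqrt(euclideanSquared(a, b));
}

const queryVector = [3, 4];
const documentVector = [0, 0];
console.log(euclidean(queryVector, documentVector)); // 5
console.log(euclideanSquared(queryVector, documentVector)); // 25
console.log(cosineSimilarity(queryVector, queryVector)); // 1
```

Euclidean Squared ranks results in the same order as Euclidean while skipping the square root, which is why it is often offered as a cheaper alternative.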

Index types

  • IndexType.HYPERSCALE - Specialized vector index for optimal vector search performance
  • IndexType.COMPOSITE - General-purpose index that can include vector and scalar fields

Advanced usage

Custom vector fields

const vectorStore = await CouchbaseQueryVectorStore.initialize(embeddings, {
  cluster,
  bucketName: "my-bucket",
  scopeName: "my-scope",
  collectionName: "my-collection",
  textKey: "content",
  embeddingKey: "vector_embedding",
  distanceStrategy: DistanceStrategy.EUCLIDEAN,
});
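
To make the effect of textKey and embeddingKey concrete, the sketch below builds the JSON shape a stored document would presumably have. The layout (text and vector at the top level, metadata nested under metadata) is an assumption inferred from filters used elsewhere on this page, such as metadata.category; toStoredDocument is a hypothetical helper, not a library function.

```typescript
// Assumed layout, inferred from field names such as `metadata.category`
// in the filter examples; the library's actual layout may differ.
type StoredDocument = Record<string, unknown>;

function toStoredDocument(
  pageContent: string,
  embedding: number[],
  metadata: Record<string, unknown>,
  textKey = "text",
  embeddingKey = "embedding",
): StoredDocument {
  if (textKey === embeddingKey) {
    throw new Error("textKey and embeddingKey must differ");
  }
  // Computed property names place the text and vector under the configured keys.
  return { [textKey]: pageContent, [embeddingKey]: embedding, metadata };
}

const stored = toStoredDocument(
  "Couchbase is a NoSQL database",
  [0.1, 0.2, 0.3], // placeholder vector; real embeddings are much longer
  { category: "database" },
  "content",
  "vector_embedding",
);
console.log(Object.keys(stored)); // [ 'content', 'vector_embedding', 'metadata' ]
```

Under this assumption, an index or filter that refers to the text must use the configured key (here content) rather than the default text.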

Creating from Texts

const texts = [
  "Couchbase is a NoSQL database",
  "Vector search enables semantic similarity"
];

const metadatas = [
  { category: "database" },
  { category: "ai" }
];

const vectorStore = await CouchbaseQueryVectorStore.fromTexts(
  texts,
  metadatas,
  embeddings,
  {
    cluster,
    bucketName: "my-bucket",
    scopeName: "my-scope",
    collectionName: "my-collection"
  }
);

Deleting documents

const documentIds = ["doc1", "doc2", "doc3"];
await vectorStore.delete({ ids: documentIds });

Performance considerations

  1. Create Indexes: Use createIndex() to create appropriate vector indexes for better performance
  2. Choose Index Type:
    • Use Hyperscale indexes for pure vector search workloads where you primarily perform similarity searches
    • Use Composite indexes for mixed queries that combine vector similarity with scalar field filtering
  3. Tune Parameters: Adjust indexScanNprobes and indexTrainlist based on your data size and performance requirements
  4. Filter Early: Use WHERE clauses to reduce the search space before vector calculations

Error handling

try {
  await vectorStore.createIndex({
    indexType: IndexType.HYPERSCALE,
    indexDescription: "IVF,SQ8",
  });
} catch (error) {
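  // The caught value is `unknown` in strict TypeScript, so narrow it before reading `.message`
  console.error("Index creation failed:", error instanceof Error ? error.message : error);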
}

Common errors

Insufficient training data

If you see errors related to insufficient training data, you may need to:
  • Increase the indexTrainlist parameter (default recommendation: ~50 vectors per centroid)
  • Ensure you have enough documents with vector embeddings in your collection
  • For collections with fewer than 1 million vectors, use number_of_vectors / 1000 for centroids
  • For larger collections, use sqrt(number_of_vectors) for centroids
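
The sizing rules in the list above can be turned into a small calculator. This is a rough sketch under stated assumptions: suggestCentroids and suggestIndexSettings are hypothetical helper names, the SQ8 quantization suffix is simply carried over from the examples on this page, and capping the train list at the number of available vectors is an added safeguard rather than documented behavior.

```typescript
// Heuristics from the list above; helper names are hypothetical.
const VECTORS_PER_CENTROID = 50;

// number_of_vectors / 1000 below one million vectors, sqrt(number_of_vectors) otherwise.
function suggestCentroids(vectorCount: number): number {
  if (!Number.isFinite(vectorCount) || vectorCount < 1) {
    throw new Error("vectorCount must be a positive number");
  }
  const raw = vectorCount < 1e6 ? vectorCount / 1000 : Math.sqrt(vectorCount);
  return Math.max(1, Math.round(raw));
}

function suggestIndexSettings(vectorCount: number): {
  indexDescription: string;
  indexTrainlist: number;
} {
  const centroids = suggestCentroids(vectorCount);
  return {
    // Same "IVF<centroids>,SQ8" pattern as the composite index example.
    indexDescription: `IVF${centroids},SQ8`,
    // Roughly 50 training vectors per centroid, capped at the vectors available (assumption).
    indexTrainlist: Math.min(vectorCount, centroids * VECTORS_PER_CENTROID),
  };
}

console.log(suggestIndexSettings(500000));
// { indexDescription: 'IVF500,SQ8', indexTrainlist: 25000 }
console.log(suggestIndexSettings(4000000));
// { indexDescription: 'IVF2000,SQ8', indexTrainlist: 100000 }
```

The returned values could be passed as indexDescription and indexTrainlist to createIndex(); treat them as starting points to tune, not as fixed recommendations.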

Comparison with CouchbaseSearchVectorStore

| Feature | CouchbaseQueryVectorStore | CouchbaseSearchVectorStore |
| --- | --- | --- |
| Service | Query (SQL++) | Search (FTS) |
| Index Required | Optional (for performance) | Required |
| Query Language | SQL++ WHERE clauses | Search query syntax |
| Vector Functions | APPROX_VECTOR_DISTANCE | VectorQuery API |
| Setup Complexity | Lower | Higher |
| Performance | Good with indexes | Optimized for search |

Frequently Asked Questions

Do I need to create an index before using CouchbaseQueryVectorStore?

No, unlike the Search-based CouchbaseSearchVectorStore, the Query-based implementation can work without pre-created indexes. However, creating appropriate vector indexes (Hyperscale or Composite) will significantly improve query performance.

When should I use Hyperscale vs. Composite indexes?

  • Use Hyperscale indexes when you primarily perform vector similarity searches with minimal filtering on other fields
  • Use Composite indexes when you frequently combine vector similarity with filtering on scalar fields in the same query
  • Learn more about how to Choose the Right Vector Index

Can I use both CouchbaseQueryVectorStore and CouchbaseSearchVectorStore on the same data?

Yes, both can work on the same document structure. However, they use different services (Search vs Query) and have different indexing requirements.