CouchbaseQueryVectorStore is the preferred implementation of Vector Search in Couchbase. It uses the Query Service (SQL++) and Index Service for vector similarity search, instead of the Search service. This provides a more powerful and straightforward approach for vector operations using SQL++ queries with vector functions.
More information about Couchbase’s vector search capabilities can be found in the official documentation: Choose the Right Vector Index.
This functionality is only available in Couchbase 8.0 and above.
Key differences from CouchbaseSearchVectorStore (formerly CouchbaseVectorStore)
- Query and Index Service: Uses Couchbase’s Query service with SQL++ instead of the Search service
- No Index Required: Does not require a pre-configured search index for basic operations
- SQL++ Syntax: Supports WHERE clauses and SQL++ query syntax for filtering
- Vector Functions: Uses the `APPROX_VECTOR_DISTANCE` function for similarity calculations
- Distance Strategies: Supports multiple distance strategies (Dot Product, Cosine, Euclidean, Euclidean Squared)
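To make the SQL++ filtering and vector function points above more concrete, the following is a minimal sketch of how a WHERE clause can be combined with `APPROX_VECTOR_DISTANCE` in one statement. It only assembles a string. The helper `buildVectorQuery`, its parameter names, the `$queryVector` placeholder, the keyspace, and the metric names are assumptions made for illustration, and this is not necessarily the statement the vector store generates internally.

```typescript
// Hypothetical helper (not part of any library): assembles an SQL++ statement
// that filters with WHERE and ranks rows by APPROX_VECTOR_DISTANCE.
interface VectorQueryParts {
  keyspace: string; // illustrative keyspace, e.g. "travel.inventory.hotel"
  vectorField: string; // document field that holds the embedding
  // Assumed metric names; confirm against the Couchbase documentation.
  metric: "DOT" | "COSINE" | "L2" | "L2_SQUARED";
  k: number; // number of nearest neighbours to return
  where?: string; // optional SQL++ predicate on scalar fields
}

function buildVectorQuery(parts: VectorQueryParts): string {
  if (!Number.isInteger(parts.k) || parts.k <= 0) {
    throw new Error("k must be a positive integer");
  }
  // $queryVector is a named query parameter to be supplied at execution time.
  const distance =
    `APPROX_VECTOR_DISTANCE(${parts.vectorField}, $queryVector, "${parts.metric}")`;
  const lines: string[] = [
    `SELECT META().id, ${distance} AS distance`,
    `FROM ${parts.keyspace}`,
  ];
  if (parts.where) {
    // Scalar predicates go in WHERE so fewer rows reach the distance ranking.
    lines.push(`WHERE ${parts.where}`);
  }
  lines.push(`ORDER BY ${distance}`);
  lines.push(`LIMIT ${parts.k}`);
  return lines.join("\n");
}

// Example: four nearest neighbours among highly rated hotels in one city.
const statement = buildVectorQuery({
  keyspace: "travel.inventory.hotel",
  vectorField: "embedding",
  metric: "COSINE",
  k: 4,
  where: "rating >= 4 AND city = $city",
});
```

Values such as the query vector and `$city` are left as named parameters rather than interpolated into the string, which avoids quoting problems and injection when the statement is executed.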
Installation
npm
Create Couchbase connection object
We create a connection to the Couchbase cluster and then pass the cluster object to the Vector Store. Here, we are connecting using the username and password. You can also connect to your cluster using any other supported method. For more information on connecting to the Couchbase cluster, please check the Node SDK documentation.
Basic setup
Creating vector indexes
The Query vector store supports creating vector indexes to improve search performance. There are two types of indexes available:
Hyperscale index
A specialized vector index optimized for vector operations using Couchbase’s vector indexing capabilities.
Composite index
A general-purpose GSI index that includes vector fields alongside scalar fields.
Key differences
| Aspect | Hyperscale Index | Composite Index |
|---|---|---|
| SQL++ Syntax | CREATE VECTOR INDEX | CREATE INDEX |
| Vector Field | (field VECTOR) with INCLUDE clause | (field1, field2, vector_field VECTOR) |
| Vector Parameters | Supports all vector parameters | Supports all vector parameters |
| Optimization | Specialized for vector operations | General-purpose GSI with vector support |
| Use Case | Pure vector similarity search | Mixed vector and scalar queries |
| Performance | Optimized for vector distance calculations | Good for hybrid queries |
| Tuning Parameters | Supports indexScanNprobes, indexTrainlist | Supports indexScanNprobes, indexTrainlist |
| Limitations | Only one vector field, uses INCLUDE for other fields | One vector field among multiple index keys |
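The "SQL++ Syntax" and "Vector Field" rows of the table can be illustrated with a small sketch that prints the two statement shapes side by side. The helper `buildIndexSkeleton`, the index names, the keyspace, and the field names are hypothetical. The output is only a skeleton: the vector parameters the table mentions (dimension, similarity, and so on) are deliberately left out because their exact form is not given here.

```typescript
// Hypothetical helper (not a library API): shows the two statement shapes
// from the table. Vector parameters are intentionally omitted.
type IndexKind = "hyperscale" | "composite";

interface IndexSpec {
  kind: IndexKind;
  indexName: string; // illustrative index name
  keyspace: string; // illustrative keyspace
  vectorField: string; // the single vector field
  scalarFields: string[]; // other fields to carry in the index
}

function buildIndexSkeleton(spec: IndexSpec): string {
  if (spec.kind === "hyperscale") {
    // Hyperscale: the vector field is the only key; other fields use INCLUDE.
    const include =
      spec.scalarFields.length > 0
        ? ` INCLUDE (${spec.scalarFields.join(", ")})`
        : "";
    return `CREATE VECTOR INDEX ${spec.indexName} ON ${spec.keyspace}(${spec.vectorField} VECTOR)${include}`;
  }
  // Composite: scalar fields and the vector field are all index keys.
  const keys = spec.scalarFields.concat(`${spec.vectorField} VECTOR`).join(", ");
  return `CREATE INDEX ${spec.indexName} ON ${spec.keyspace}(${keys})`;
}

const hyperscaleSkeleton = buildIndexSkeleton({
  kind: "hyperscale",
  indexName: "idx_hotel_hs",
  keyspace: "travel.inventory.hotel",
  vectorField: "embedding",
  scalarFields: ["city", "rating"],
});
// CREATE VECTOR INDEX idx_hotel_hs ON travel.inventory.hotel(embedding VECTOR) INCLUDE (city, rating)

const compositeSkeleton = buildIndexSkeleton({
  kind: "composite",
  indexName: "idx_hotel_cmp",
  keyspace: "travel.inventory.hotel",
  vectorField: "embedding",
  scalarFields: ["city", "rating"],
});
// CREATE INDEX idx_hotel_cmp ON travel.inventory.hotel(city, rating, embedding VECTOR)
```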
Basic vector search example
The following example showcases how to use Couchbase Query vector search and perform similarity search.
Searching documents
Basic similarity search
Search with filters
Search with scores
Complex filtering
Configuration options
Distance strategies
- `DistanceStrategy.DOT` - Dot Product (default)
- `DistanceStrategy.COSINE` - Cosine Similarity
- `DistanceStrategy.EUCLIDEAN` - Euclidean Distance (also known as L2)
- `DistanceStrategy.EUCLIDEAN_SQUARED` - Euclidean Squared Distance (also known as L2 Squared)
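For readers who want to see what the four strategies measure, here is a minimal sketch of the underlying formulas in plain TypeScript. The function names are illustrative and not part of the vector store API, and the sketch does not show how Couchbase converts these values into the scores it returns.

```typescript
// Illustrative reference implementations of the four measures.
type Vec = readonly number[];

function assertSameLength(a: Vec, b: Vec): void {
  if (a.length !== b.length || a.length === 0) {
    throw new Error("vectors must be non-empty and of equal length");
  }
}

// Dot product: sum of element-wise products.
function dotProduct(a: Vec, b: Vec): number {
  assertSameLength(a, b);
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

// Cosine similarity: dot product divided by the product of the norms.
function cosineSimilarity(a: Vec, b: Vec): number {
  const norms = Math.sqrt(dotProduct(a, a)) * Math.sqrt(dotProduct(b, b));
  if (norms === 0) throw new Error("cosine similarity is undefined for a zero vector");
  return dotProduct(a, b) / norms;
}

// Euclidean squared (L2 squared): sum of squared element-wise differences.
function euclideanSquared(a: Vec, b: Vec): number {
  assertSameLength(a, b);
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    sum += d * d;
  }
  return sum;
}

// Euclidean (L2): square root of the squared distance.
function euclidean(a: Vec, b: Vec): number {
  return Math.sqrt(euclideanSquared(a, b));
}

const u: Vec = [3, 4];
const v: Vec = [4, 3];
const dotUV = dotProduct(u, v); // 24
const cosUV = cosineSimilarity(u, v); // 0.96 (24 / 25)
const l2SquaredUV = euclideanSquared(u, v); // 2
const l2UV = euclidean(u, v); // Math.SQRT2, about 1.414
```

Euclidean and Euclidean Squared rank neighbours identically, since the square root preserves order. The squared form simply skips the square root.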
Index types
- `IndexType.HYPERSCALE` - Specialized vector index for optimal vector search performance
- `IndexType.COMPOSITE` - General-purpose index that can include vector and scalar fields
Advanced usage
Custom vector fields
Creating from Texts
Deleting documents
Performance considerations
- Create Indexes: Use `createIndex()` to create appropriate vector indexes for better performance
- Choose Index Type:
  - Use Hyperscale indexes for pure vector search workloads where you primarily perform similarity searches
  - Use Composite indexes for mixed queries that combine vector similarity with scalar field filtering
- Tune Parameters: Adjust `indexScanNprobes` and `indexTrainlist` based on your data size and performance requirements
- Filter Early: Use WHERE clauses to reduce the search space before vector calculations
Error handling
Common errors
Insufficient training data
If you see errors related to insufficient training data, you may need to:
- Increase the `indexTrainlist` parameter (default recommendation: ~50 vectors per centroid)
- Ensure you have enough documents with vector embeddings in your collection
- For collections with < 1 million vectors, use `number_of_vectors / 1000` for centroids
- For larger collections, use `sqrt(number_of_vectors)` for centroids
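The sizing rules above can be written as a short calculation. This is a sketch only: the function names `suggestCentroids` and `suggestTrainlist` are hypothetical, and rounding to the nearest integer and enforcing a minimum of one centroid are assumptions added so the result is usable as a count.

```typescript
// Hypothetical helpers that encode the rules of thumb listed above.
function suggestCentroids(numVectors: number): number {
  if (!Number.isFinite(numVectors) || numVectors <= 0) {
    throw new Error("numVectors must be a positive number");
  }
  // Below 1 million vectors: n / 1000. Otherwise: sqrt(n).
  const raw = numVectors < 1000000 ? numVectors / 1000 : Math.sqrt(numVectors);
  // Assumption: round to an integer and never go below one centroid.
  return Math.max(1, Math.round(raw));
}

// About 50 training vectors per centroid, per the recommendation above.
function suggestTrainlist(centroids: number, vectorsPerCentroid: number = 50): number {
  return centroids * vectorsPerCentroid;
}

const smallCentroids = suggestCentroids(200000); // 200
const smallTrainlist = suggestTrainlist(smallCentroids); // 10000
const largeCentroids = suggestCentroids(4000000); // 2000
const largeTrainlist = suggestTrainlist(largeCentroids); // 100000
```

If the suggested training list is larger than the number of documents that actually carry embeddings, the collection itself is the limiting factor, which matches the second bullet above.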
Comparison with CouchbaseSearchVectorStore
| Feature | CouchbaseQueryVectorStore | CouchbaseSearchVectorStore |
|---|---|---|
| Service | Query (SQL++) | Search (FTS) |
| Index Required | Optional (for performance) | Required |
| Query Language | SQL++ WHERE clauses | Search query syntax |
| Vector Functions | APPROX_VECTOR_DISTANCE | VectorQuery API |
| Setup Complexity | Lower | Higher |
| Performance | Good with indexes | Optimized for search |
Frequently Asked Questions
Do I need to create an index before using CouchbaseQueryVectorStore?
No, unlike the Search-based CouchbaseSearchVectorStore, the Query-based implementation can work without pre-created indexes. However, creating appropriate vector indexes (Hyperscale or Composite) will significantly improve query performance.
When should I use Hyperscale vs. Composite indexes?
- Use Hyperscale indexes when you primarily perform vector similarity searches with minimal filtering on other fields
- Use Composite indexes when you frequently combine vector similarity with filtering on scalar fields in the same query
- Learn more about how to Choose the Right Vector Index
Can I use both CouchbaseQueryVectorStore and CouchbaseSearchVectorStore on the same data?
Yes, both can work on the same document structure. However, they use different services (Search vs Query) and have different indexing requirements.
Related
- Vector store conceptual guide
- Vector store how-to guides