SAP HANA Cloud Vector Engine is a vector store fully integrated into the SAP HANA Cloud database.
Setup
Install the @sap/hana-langchain external integration package, as well as the other packages used throughout this notebook.
Credentials
Ensure your SAP HANA instance is running. Load credentials from environment variables and create a connection using your preferred HANA client.
Initialization
To initialize a HanaDB vector store, you need a database connection and an embedding instance. SAP HANA Cloud Vector Engine supports both external and internal embeddings.
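Since the store needs an open database connection, the connection settings have to come from somewhere. The sketch below shows one way to collect them from environment variables before handing them to a HANA client. The variable names (HANA_HOST, HANA_PORT, HANA_USER, HANA_PASSWORD), the HanaConnectionSettings shape, and the loadHanaSettings helper are all assumptions made for illustration; they are not part of @sap/hana-langchain, and the field names your HANA client expects may differ.

```typescript
// Hypothetical settings shape; rename the fields to match your HANA client.
interface HanaConnectionSettings {
  host: string;
  port: number;
  user: string;
  password: string;
}

// Reads settings from an environment-like object (for example process.env).
// The variable names are assumptions made for this sketch.
function loadHanaSettings(
  env: Record<string, string | undefined>
): HanaConnectionSettings {
  const required = ["HANA_HOST", "HANA_PORT", "HANA_USER", "HANA_PASSWORD"];
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }

  const port = Number(env["HANA_PORT"]);
  if (!Number.isInteger(port) || port <= 0) {
    throw new Error("HANA_PORT must be a positive integer");
  }

  return {
    host: env["HANA_HOST"] as string,
    port,
    user: env["HANA_USER"] as string,
    password: env["HANA_PASSWORD"] as string,
  };
}
```

In a Node.js program you would typically call loadHanaSettings(process.env) once at startup, so a missing variable fails early with a readable message instead of surfacing later as a connection error.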
Using external embeddings
Using internal embeddings
Alternatively, you can compute embeddings directly in SAP HANA using its native VECTOR_EMBEDDING() function. If you have internal embedding support available in your TypeScript environment, initialize and pass it to HanaDB similarly. For more information about internal embedding, see the SAP HANA VECTOR_EMBEDDING Function.
Caution: Ensure NLP is enabled in your SAP HANA Cloud instance.
Pass the database connection and the embedding instance to HanaDB along with a table name for storing vectors:
Manage vector store
Once you have created your vector store, you can interact with it by adding and deleting items.
Add items to vector store
You can add items to the vector store by using the addDocuments function.
Delete items from vector store
Query vector store
Query directly
Similarity search
Performing a simple similarity search with filtering on metadata can be done as follows:
MMR search
Performing a Maximal Marginal Relevance (MMR) search with filtering on metadata can be done as follows:
Query by turning into retriever
You can also transform the vector store into a retriever for easier usage in your chains.
Distance similarity algorithm
HanaDB supports the following distance similarity algorithms:
- Cosine Similarity (default)
- Euclidean Distance (L2)
Select the algorithm when creating the HanaDB instance by using the distanceStrategy parameter.
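To make the difference between the two algorithms concrete, here is a minimal sketch of what each metric computes on plain number arrays. This only illustrates the mathematics; in practice the computation runs inside SAP HANA, and the function names cosineSimilarity and euclideanDistance are chosen for this example.

```typescript
// Cosine similarity: compares direction only. Higher means more similar,
// with 1 for vectors pointing the same way and 0 for orthogonal vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    throw new Error("Cosine similarity is undefined for zero vectors");
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Euclidean (L2) distance: straight-line distance. Lower means more similar,
// with 0 for identical vectors.
function euclideanDistance(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let sumOfSquares = 0;
  for (let i = 0; i < a.length; i++) {
    const diff = a[i] - b[i];
    sumOfSquares += diff * diff;
  }
  return Math.sqrt(sumOfSquares);
}
```

Note the opposite orientation of the two scores: results ranked by cosine similarity are best at the high end, while results ranked by Euclidean distance are best at the low end.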
Creating a HNSW index
A vector index can significantly speed up top-k nearest neighbor queries for vectors. Users can create a Hierarchical Navigable Small World (HNSW) vector index using the createHnswIndex function.
For more information about creating an index at the database level, please refer to the official documentation.
Advanced filtering
In addition to the basic value-based filtering capabilities, it is possible to use more advanced filtering. The table below shows the available filter operators.
| Operator | Semantic |
|---|---|
| $eq | Equality (==) |
| $ne | Inequality (!=) |
| $lt | Less than (<) |
| $lte | Less than or equal (<=) |
| $gt | Greater than (>) |
| $gte | Greater than or equal (>=) |
| $in | Contained in a set of given values (in) |
| $nin | Not contained in a set of given values (not in) |
| $between | Between the range of two boundary values |
| $like | Text equality based on the “LIKE” semantics in SQL (using “%” as wildcard) |
| $contains | Filters documents containing a specific keyword |
| $and | Logical “and”, supporting two or more operands |
| $or | Logical “or”, supporting two or more operands |
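To illustrate how these operators read in practice, the sketch below evaluates filter objects against a metadata record in memory. It is not how HanaDB evaluates filters (the store handles that in the database); the matches helper, the case-insensitive substring interpretation of $contains, treating $between as inclusive, and handling only "%" in $like patterns are all assumptions made for this example.

```typescript
type Scalar = string | number;
type Metadata = Record<string, Scalar | undefined>;
type Filter = Record<string, unknown>;

// Converts a LIKE pattern into an anchored RegExp; only "%" is treated as a wildcard.
function likeToRegExp(pattern: string): RegExp {
  const body = pattern
    .split("%")
    .map((part) => part.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"))
    .join("[\\s\\S]*");
  return new RegExp("^" + body + "$");
}

function applyOperator(op: string, actual: Scalar | undefined, expected: unknown): boolean {
  if (actual === undefined) return false; // a missing key never matches
  switch (op) {
    case "$eq": return actual === expected;
    case "$ne": return actual !== expected;
    case "$lt": return actual < (expected as Scalar);
    case "$lte": return actual <= (expected as Scalar);
    case "$gt": return actual > (expected as Scalar);
    case "$gte": return actual >= (expected as Scalar);
    case "$in": return (expected as Scalar[]).includes(actual);
    case "$nin": return !(expected as Scalar[]).includes(actual);
    case "$between": {
      const [low, high] = expected as [Scalar, Scalar];
      return actual >= low && actual <= high; // inclusive bounds assumed
    }
    case "$like": return likeToRegExp(String(expected)).test(String(actual));
    case "$contains":
      return String(actual).toLowerCase().includes(String(expected).toLowerCase());
    default: throw new Error("Unsupported operator: " + op);
  }
}

// Returns true when the metadata satisfies every condition in the filter.
function matches(metadata: Metadata, filter: Filter): boolean {
  return Object.entries(filter).every(([key, condition]) => {
    if (key === "$and") return (condition as Filter[]).every((f) => matches(metadata, f));
    if (key === "$or") return (condition as Filter[]).some((f) => matches(metadata, f));
    const actual = metadata[key];
    // A bare value is the basic value-based filter, i.e. equality.
    if (typeof condition !== "object" || condition === null) return actual === condition;
    return Object.entries(condition as Record<string, unknown>).every(
      ([op, expected]) => applyOperator(op, actual, expected)
    );
  });
}
```

For example, matches({ name: "adam", weight: 12 }, { $or: [{ weight: { $gt: 100 } }, { name: { $like: "a%" } }] }) succeeds through the second branch of the $or.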
$ne, $gt, $gte, $lt, $lte
$between, $in, $nin
$like
$contains
$and, $or
Usage for retrieval-augmented generation
For guides on how to use this vector store for retrieval-augmented generation (RAG), see the following sections:
Standard tables vs. custom tables with vector data
By default, the embeddings table contains three columns:
- VEC_TEXT: the document text
- VEC_META: the document metadata
- VEC_VECTOR: the embedding vector

- An NCLOB/NVARCHAR column for the text/context
- An NCLOB/NVARCHAR column for the metadata
- A REAL_VECTOR column for the embedding vector

- A column with type NCLOB or NVARCHAR for the text/context of the embeddings
- A column with type NCLOB or NVARCHAR for the metadata
- A column with type REAL_VECTOR or HALF_VECTOR for the embedding vector
Filter performance optimization with custom columns
To allow flexible metadata values, all metadata is stored as JSON in the metadata column by default. If some of the metadata keys and their value types are known in advance, they can be stored in additional columns instead: create the target table with the key names as column names and pass those names to the HanaDB constructor via the specificMetadataColumns list. Metadata keys that match those names are copied into the corresponding columns during insert. Filters use these columns instead of the metadata JSON column for keys in the specificMetadataColumns list.
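The effect of specificMetadataColumns on an inserted row can be pictured with the following sketch. It is an illustration only, not the library's insert logic: the splitMetadata helper and the InsertRow shape are hypothetical, and the sketch assumes the full metadata stays in the JSON column while matching keys are additionally copied out, which is one reading of "copied" above.

```typescript
type MetadataValue = string | number | boolean;

// Hypothetical view of what ends up in one table row.
interface InsertRow {
  // JSON string destined for the metadata column.
  metadataJson: string;
  // Values destined for dedicated columns, keyed by column name.
  dedicatedColumns: Record<string, MetadataValue>;
}

// Copies every key listed in specificMetadataColumns into its own column value.
function splitMetadata(
  metadata: Record<string, MetadataValue>,
  specificMetadataColumns: string[]
): InsertRow {
  const dedicatedColumns: Record<string, MetadataValue> = {};
  for (const key of specificMetadataColumns) {
    if (key in metadata) {
      dedicatedColumns[key] = metadata[key];
    }
  }
  // Assumption: the JSON column still carries the complete metadata object.
  return { metadataJson: JSON.stringify(metadata), dedicatedColumns };
}
```

A filter on a key such as quality can then be answered from a typed column rather than by parsing JSON for every row, which is where the performance benefit comes from.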
A simple example
Load the sample document “state_of_the_union.txt” and create chunks from it. First, install the @langchain/textsplitters package:
Maximal marginal relevance search (MMR)
Maximal marginal relevance optimizes for similarity to query AND diversity among selected documents. The first 20 (fetch_k) items will be retrieved from the DB. The MMR algorithm will then find the best 2 (k) matches.
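The selection step can be sketched in a few lines. The code below is a generic MMR implementation over in-memory vectors, not the code used by HanaDB; it assumes the candidates have already been fetched (the fetch_k items), and the mmrSelect name and the lambda weighting parameter are choices made for this example.

```typescript
// Cosine similarity between two equal-length, non-zero vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Greedily picks k candidate indices, trading relevance to the query against
// similarity to already selected candidates.
// lambda = 1 ranks purely by relevance; lower values favor diversity.
function mmrSelect(
  query: number[],
  candidates: number[][],
  k: number,
  lambda: number = 0.5
): number[] {
  const selected: number[] = [];
  const remaining = candidates.map((_, index) => index);

  while (selected.length < k && remaining.length > 0) {
    let bestPosition = 0;
    let bestScore = -Infinity;
    remaining.forEach((candidateIndex, position) => {
      const relevance = cosine(query, candidates[candidateIndex]);
      // Redundancy is the highest similarity to anything picked so far.
      const redundancy =
        selected.length === 0
          ? 0
          : Math.max(
              ...selected.map((s) => cosine(candidates[candidateIndex], candidates[s]))
            );
      const score = lambda * relevance - (1 - lambda) * redundancy;
      if (score > bestScore) {
        bestScore = score;
        bestPosition = position;
      }
    });
    selected.push(remaining.splice(bestPosition, 1)[0]);
  }
  return selected;
}
```

With a near-duplicate among the candidates, a low lambda skips the duplicate in favor of a less similar but still relevant item, whereas lambda = 1 returns the plain top-k by similarity.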
Creating a HNSW vector index
A vector index can significantly speed up top-k nearest neighbor queries for vectors. Users can create a Hierarchical Navigable Small World (HNSW) vector index using the createHnswIndex function.
For more information about creating an index at the database level, please refer to the official documentation.
- Similarity Function: The similarity function for the index is cosine similarity by default. If you want to use a different similarity function (e.g., L2 distance), you need to specify it when initializing the HanaDB instance.
- Default Parameters: In the createHnswIndex function, if the user does not provide custom values for parameters like m, efConstruction, or efSearch, the default values (e.g., m=64, efConstruction=128, efSearch=200) will be used automatically. These values ensure the index is created with reasonable performance without requiring user intervention.
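The fallback behavior described in the second point can be pictured as a simple merge of user-supplied options with defaults. This is a sketch of the idea, not the createHnswIndex implementation: the HnswOptions interface and resolveHnswOptions helper are hypothetical, and the default values are the example values quoted above.

```typescript
// Hypothetical options object; only the three parameters named above.
interface HnswOptions {
  m?: number;
  efConstruction?: number;
  efSearch?: number;
}

// Example defaults taken from the text above.
const HNSW_DEFAULTS: Required<HnswOptions> = {
  m: 64,
  efConstruction: 128,
  efSearch: 200,
};

// Fills in any parameter the caller left out (or set to undefined).
function resolveHnswOptions(options: HnswOptions = {}): Required<HnswOptions> {
  return {
    m: options.m ?? HNSW_DEFAULTS.m,
    efConstruction: options.efConstruction ?? HNSW_DEFAULTS.efConstruction,
    efSearch: options.efSearch ?? HNSW_DEFAULTS.efSearch,
  };
}
```

Calling resolveHnswOptions({ m: 16 }) keeps the custom m while efConstruction and efSearch fall back to their defaults.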