Azure Cosmos DB for NoSQL provides support for querying items with flexible schemas and native support for JSON. It now offers vector indexing and search. This feature is designed to handle high-dimensional vectors, enabling efficient and accurate vector search at any scale. You can now store vectors directly in the documents alongside your data. Each document in your database can contain not only traditional schema-free data, but also high-dimensional vectors as other properties of the documents.
Learn how to leverage the vector search capabilities of Azure Cosmos DB for NoSQL from this page. If you don’t have an Azure account, you can create a free account to get started.
Setup
You’ll first need to install the @langchain/azure-cosmosdb package, for example with npm:
npm install @langchain/azure-cosmosdb
.env example
Using Azure Managed Identity
If you’re using Azure Managed Identity, you can configure the credentials like this:
When using Azure Managed Identity and role-based access control (RBAC), you must ensure that the database and container have been created beforehand. RBAC does not provide permissions to create databases and containers. You can get more information about the permission model in the Azure Cosmos DB documentation.
Security considerations when using filters
Using filters with user-provided input can be a security risk if the data is not sanitized properly. Follow the recommendation below to prevent potential security issues.
Concatenating raw user input into a query clause, such as WHERE ${userFilter}, introduces a critical risk of SQL injection attacks, potentially exposing unintended data or compromising your system’s integrity. To mitigate this, always use Azure Cosmos DB’s parameterized query mechanism, passing in @param placeholders, which cleanly separates the query logic from user-provided input.
Consider unsafe code that interpolates a user-provided userId directly into the query text of a filter. If the user enters 123 OR 1=1, then the query becomes SELECT * FROM c WHERE c.metadata.userId = '123' OR 1=1, which forces the condition to always be true, causing it to bypass the intended filter and delete all documents.
To prevent this injection risk, define a placeholder such as @userId. Cosmos DB then binds the user input separately as a parameter, ensuring it is treated strictly as data and not as executable query logic. If the user now enters 123 OR 1=1, the input will be treated as a literal string value to match, and not as part of the query structure.
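The contrast between concatenation and parameter binding can be sketched in a few lines of TypeScript. This is a minimal illustration, not the library's own code: the helper names buildUnsafeFilter and buildSafeFilter are hypothetical, and the SqlParameter and SqlQuerySpec interfaces are declared locally on the assumption that they mirror the { query, parameters } shape Azure Cosmos DB uses for parameterized queries. The unsafe helper interpolates the value without quotes, so its output text differs slightly from the query quoted above.

```typescript
// Local types assumed to mirror the { query, parameters } shape of a
// Cosmos DB parameterized query; declared here so the sketch runs on its own.
interface SqlParameter {
  name: string;
  value: string | number | boolean | null;
}

interface SqlQuerySpec {
  query: string;
  parameters?: SqlParameter[];
}

// UNSAFE (hypothetical helper): user input is concatenated into the query
// text, so anything the user types becomes part of the query logic.
function buildUnsafeFilter(userId: string): SqlQuerySpec {
  return { query: `SELECT * FROM c WHERE c.metadata.userId = ${userId}` };
}

// SAFE (hypothetical helper): the query text is constant and the user input
// travels separately as the value bound to the @userId placeholder.
function buildSafeFilter(userId: string): SqlQuerySpec {
  return {
    query: "SELECT * FROM c WHERE c.metadata.userId = @userId",
    parameters: [{ name: "@userId", value: userId }],
  };
}

const malicious = "123 OR 1=1";
const unsafeSpec = buildUnsafeFilter(malicious);
const safeSpec = buildSafeFilter(malicious);

// The injected condition ends up inside the unsafe query text...
console.log(unsafeSpec.query);
// ...while the safe query text never changes and the input stays a value.
console.log(safeSpec.query);
console.log(JSON.stringify(safeSpec.parameters));
```

In the safe variant the query string is identical for every user, so the malicious input can only ever be compared against c.metadata.userId as a literal value.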
Please refer to the official documentation on parameterized queries in Azure Cosmos DB for NoSQL for more usage examples and best practices.
Usage example
Below is an example that indexes documents from a file in Azure Cosmos DB for NoSQL, runs a vector search query, and finally uses a chain to answer a question in natural language based on the retrieved documents.
Related
- Vector store conceptual guide
- Vector store how-to guides