Milvus is a database that stores, indexes, and manages massive embedding vectors generated by deep neural networks and other machine learning (ML) models. This notebook shows how to use functionality related to the Milvus vector database.
Install langchain-milvus with pip install -qU langchain-milvus to use this integration.
Milvus
vector store.
If you want to use Zilliz Cloud, the fully managed cloud service for Milvus, adjust the uri and token, which correspond to the Public Endpoint and API key in Zilliz Cloud.
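As a minimal sketch of the adjustment described above, the helpers below build the connection_args dictionary for either a self-hosted server or Zilliz Cloud. The helper names, the example endpoint, and the placeholder API key are hypothetical; only the uri and token keys come from the text.

```python
def milvus_connection_args(uri, token=None):
    """Return connection_args for the Milvus vector store.

    uri:   a Milvus server address, or the Zilliz Cloud Public Endpoint.
    token: the Zilliz Cloud API key; omit it for an unauthenticated server.
    """
    args = {"uri": uri}
    if token:
        args["token"] = token
    return args


def make_vector_store(embeddings, uri, token=None):
    """Create a Milvus vector store for the given embedding model (hypothetical helper)."""
    # Imported lazily so this sketch loads even where langchain-milvus is absent.
    from langchain_milvus import Milvus

    return Milvus(
        embedding_function=embeddings,
        connection_args=milvus_connection_args(uri, token),
    )


# Hypothetical values; substitute your own endpoint and key.
local_args = milvus_connection_args("http://localhost:19530")
cloud_args = milvus_connection_args(
    "https://example-endpoint.zillizcloud.example", token="YOUR_API_KEY"
)
```

Keeping the dictionary construction separate from the store creation makes it easy to switch between a local server and Zilliz Cloud without touching the rest of the code.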
Add items to the vector store with the add_documents function.
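A minimal sketch of calling add_documents with explicit IDs, assuming vector_store is an already-initialized Milvus vector store and documents is a list of LangChain Document objects. The helper name add_with_ids is hypothetical; passing ids alongside documents follows the usual LangChain vector store pattern.

```python
from uuid import uuid4


def add_with_ids(vector_store, documents):
    """Add documents to the store under freshly generated UUID strings.

    Returns the list of IDs so the same entries can be deleted or updated later.
    """
    ids = [str(uuid4()) for _ in documents]
    vector_store.add_documents(documents=documents, ids=ids)
    return ids
```

Generating the IDs on the client side keeps a handle on each entry without a follow-up query.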
For more details on the Milvus vector store, you can visit the API reference.
Full-text search is configured through the builtin_function parameter of the Milvus vector store. Through this parameter, you can pass in an instance of BM25BuiltInFunction. This differs from semantic search, which usually passes dense embeddings to the VectorStore.
Here is a simple example of hybrid search in Milvus with OpenAI dense embedding for semantic search and BM25 for full-text search:
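A minimal sketch of such a setup, assuming a running Milvus server (Standalone or Distributed), the langchain-openai package, and an OpenAI API key in the environment. The helper name build_hybrid_store and the constant HYBRID_VECTOR_FIELDS are hypothetical, and the field names assume that "sparse" is the output field of BM25BuiltInFunction.

```python
# Assumed field names: "dense" holds the OpenAI embeddings,
# "sparse" holds the BM25 output produced on the Milvus server.
HYBRID_VECTOR_FIELDS = ["dense", "sparse"]


def build_hybrid_store(documents, uri):
    """Create a dense + sparse hybrid Milvus vector store from documents."""
    # Imported lazily so the sketch can be loaded without the packages installed.
    from langchain_milvus import BM25BuiltInFunction, Milvus
    from langchain_openai import OpenAIEmbeddings

    return Milvus.from_documents(
        documents=documents,
        embedding=OpenAIEmbeddings(),  # dense vectors for semantic search
        builtin_function=BM25BuiltInFunction(),  # BM25 for full-text search
        vector_field=HYBRID_VECTOR_FIELDS,
        connection_args={"uri": uri},
    )
```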
In the code above, we define an instance of BM25BuiltInFunction and pass it to the Milvus object. BM25BuiltInFunction is a lightweight wrapper class for Function in Milvus. We can use it with OpenAIEmbeddings to initialize a dense + sparse hybrid search Milvus vector store instance.
BM25BuiltInFunction does not require the client to pass a corpus or perform any training; both are handled automatically on the Milvus server, so users do not need to manage a vocabulary or corpus. Users can also customize the analyzer to implement custom text processing for BM25.
- When you use BM25BuiltInFunction, please note that full-text search is available in Milvus Standalone and Milvus Distributed, but not in Milvus Lite, although it is on the roadmap for future inclusion. It will also be available in Zilliz Cloud (fully-managed Milvus) soon. Please reach out to support@zilliz.com for more information.
The partition key feature is not available in Milvus Lite. To use it, you need to start a Milvus server, as mentioned above.
search_kwargs={"expr": '<partition_key> == "xxxx"'}
search_kwargs={"expr": '<partition_key> in ["xxx", "xxx"]'}
Replace <partition_key> with the name of the field that is designated as the partition key.
Milvus changes to a partition based on the specified partition key, filters entities according to the partition key, and searches among the filtered entities.
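The filter expressions above can be generated rather than written by hand. The sketch below is one way to do it, assuming string-valued partition keys; the helper names partition_filter and partition_retriever and the field name namespace are hypothetical, and quoting via json.dumps is an assumption that suits simple values.

```python
import json


def partition_filter(field, values):
    """Build a Milvus boolean expression matching one or more partition-key values."""
    if isinstance(values, str):
        values = [values]
    values = list(values)
    if not values:
        raise ValueError("at least one partition-key value is required")
    # json.dumps wraps each value in double quotes and escapes embedded quotes.
    quoted = [json.dumps(v, ensure_ascii=False) for v in values]
    if len(quoted) == 1:
        return f"{field} == {quoted[0]}"
    return f"{field} in [{', '.join(quoted)}]"


def partition_retriever(vector_store, field, values):
    """Return a retriever restricted to the given partition-key values."""
    return vector_store.as_retriever(
        search_kwargs={"expr": partition_filter(field, values)}
    )


single = partition_filter("namespace", "tenant_a")
several = partition_filter("namespace", ["tenant_a", "tenant_b"])
```

A single value produces an equality test and several values produce an in list, matching the two search_kwargs forms shown earlier.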