This guide covers how to get started with the Weaviate vector store in LangChain, using the langchain-weaviate package.

Weaviate is an open-source vector database. It allows you to store data objects and vector embeddings from your favorite ML models, and scale seamlessly into billions of data objects. To use this integration, you need to have a running Weaviate database instance.
This integration requires Weaviate version 1.23.7 or higher. However, we recommend you use the latest version of Weaviate.
Here, we assume a local instance of Weaviate running at http://localhost:8080, with port 50051 open for gRPC traffic, and we connect to it with the Weaviate Python client.
Note that you require a v4 client API, which will create a weaviate.WeaviateClient object.
In this guide, we use OpenAIEmbeddings. We suggest obtaining an OpenAI API key and exporting it as an environment variable named OPENAI_API_KEY.

Once this is done, your OpenAI API key will be read automatically. If you are new to environment variables, read more about them here or in this guide.
Now, we will demonstrate the Weaviate vector store by loading and chunking the contents of a long text file.
Next, we can import the chunked documents into Weaviate using the weaviate_client object. For example, we can import the documents as shown below:
You can optionally pass a k parameter, which is the upper limit of the number of results to return.
By default, similarity_search uses Weaviate's hybrid search.
A hybrid search combines a vector search and a keyword search, with alpha as the weight of the vector search. The similarity_search function allows you to pass additional arguments as kwargs; see the reference docs for the available arguments. So, you can perform a pure keyword search by setting alpha=0 as shown below:
All data added through langchain-weaviate will persist in Weaviate according to its configuration. WCS instances, for example, are configured to persist data indefinitely, and Docker instances can be set up to persist data in a volume. Read more about Weaviate's persistence.
Weaviate can isolate data for multiple users or groups through multi-tenancy, which you enable with the tenant parameter.

So, when adding any data, provide the tenant parameter as shown below. When performing queries, provide the tenant parameter also.
The Weaviate vector store can also be used as a LangChain retriever, for example with the search type mmr (maximal marginal relevance).
For question answering over the stored documents, we can use LangChain's RetrievalQAWithSourcesChain, which does the lookup of the documents from an Index.

First, we will chunk the text again and import it into the Weaviate vector store.