Activeloop Deep Memory is a suite of tools that enables you to optimize your Vector Store for your use case and achieve higher accuracy in your LLM apps.
Retrieval-Augmented Generation (RAG) has recently gained significant attention. As advanced RAG techniques and agents emerge, they expand the potential of what RAG systems can accomplish. However, several challenges may limit the integration of RAG into production. The primary factors to consider when implementing RAG in production settings are accuracy (recall), cost, and latency. For basic use cases, OpenAI's Ada model paired with a naive similarity search can produce satisfactory results. Yet, for higher accuracy or recall during searches, one might need to employ advanced retrieval techniques. These methods might involve varying data chunk sizes, rewriting queries multiple times, and more, potentially increasing latency and costs. Activeloop's Deep Memory, a feature available to Activeloop Deep Lake users, addresses these issues by introducing a tiny neural network layer trained to match user queries with relevant data from a corpus. While this addition incurs minimal latency during search, it can boost retrieval accuracy by up to 27% and remains cost-effective and simple to use, without requiring any additional advanced RAG techniques.
For this tutorial, we will parse the Deep Lake documentation and build a RAG system that can answer questions from the docs. We will use the BeautifulSoup library and LangChain's document parsers, such as Html2TextTransformer and AsyncHtmlLoader, so we will need to install the following libraries:
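For example (the package names below are our best guess at the pip distributions for the libraries mentioned above; the exact list for this tutorial may differ):

```shell
pip install deeplake langchain beautifulsoup4 html2text
```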
To train Deep Memory, we need two inputs:

- `questions` - a list of strings, where each string represents a query.
- `relevance` - the ground truth for each question. There might be several docs that contain the answer to a given question, so `relevance` has the type `List[List[tuple[str, float]]]`, where the outer list represents queries and each inner list the relevant documents. Each tuple is a `(str, float)` pair: the string is the id of the source doc (corresponding to the `id` tensor in the dataset), while the float indicates how relevant that document is to the question.

After training, we can evaluate the model's performance using recall metrics.
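As a sketch, the two inputs might look like this (the sample queries and document ids are invented for illustration, not taken from the actual dataset):

```python
from typing import List, Tuple

# One query string per entry.
questions: List[str] = [
    "How do I create a vector store in Deep Lake?",
    "What embedding models can I use with Deep Lake?",
]

# One inner list per question, holding (doc_id, score) pairs;
# doc_id matches the `id` tensor in the dataset, and score says
# how relevant that document is to the question.
relevance: List[List[Tuple[str, float]]] = [
    [("doc_12", 1.0)],                  # a single ground-truth doc
    [("doc_4", 1.0), ("doc_27", 0.5)],  # several relevant docs
]

# Both lists must be aligned query-by-query.
assert len(questions) == len(relevance)
```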
It can be done easily in a few lines of code.
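Deep Lake reports these numbers for you, but to illustrate what such an evaluation computes, here is a minimal recall@k sketch built on the questions/relevance format described above (the function and the sample ids are ours, not Deep Lake's API):

```python
from typing import List, Tuple


def recall_at_k(
    retrieved: List[List[str]],
    relevance: List[List[Tuple[str, float]]],
    k: int = 10,
) -> float:
    """Fraction of queries whose top-k retrieved ids include
    at least one ground-truth document id."""
    hits = 0
    for top_ids, truth in zip(retrieved, relevance):
        truth_ids = {doc_id for doc_id, _score in truth}
        if truth_ids & set(top_ids[:k]):
            hits += 1
    return hits / len(relevance) if relevance else 0.0


# Toy example: ids returned by the search for two queries,
# and the ground-truth (id, score) pairs for each query.
retrieved = [["doc_3", "doc_7", "doc_1"], ["doc_9", "doc_2"]]
relevance = [[("doc_7", 1.0)], [("doc_5", 1.0)]]
print(recall_at_k(retrieved, relevance, k=3))  # 0.5
```

Only the first query's ground-truth doc appears in the top 3 results, so recall@3 is 0.5.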