With langchain-plainid, you can:
- Filter RAG data: Dynamically filter documents retrieved from your vector store based on the user’s permissions, ensuring they only see data they are authorized to access.
- Authorize prompts: Control whether a user or tenant is allowed to invoke a chain or tool based on the category of their query.
- Anonymize data: Detect and anonymize (mask or encrypt) PII or other sensitive entities in responses, based on policies defined in PlainID.
Installation
First, install the partner package:
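Assuming the PyPI distribution name matches the project name:

```bash
pip install -U langchain-plainid
```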
Setup
Next, configure the provider with credentials from your PlainID tenant: your Client ID, Client Secret, and Base URL. You can set these as environment variables:
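For example (the variable names below are an assumption for illustration; check the package documentation for the exact keys it reads):

```python
import os

# Hypothetical environment variable names -- confirm against the package docs.
os.environ["PLAINID_CLIENT_ID"] = "your-client-id"
os.environ["PLAINID_CLIENT_SECRET"] = "your-client-secret"
os.environ["PLAINID_BASE_URL"] = "https://your-tenant.plainid.io"
```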
Usage
The package provides three main components for enforcing authorization.
RAG Data Filtering
The PlainIDRetriever wraps your existing vector store retriever. It fetches authorization filters from PlainID based on the user's identity and applies them to the vector store query. This filters out documents before they are passed to the LLM for context.
The example below assumes you have a PlainIDPermissionsProvider configured (e.g., via environment variables) and a PlainIDRetrieverFilterProvider set up.
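A minimal sketch of wrapping an existing retriever; the constructor parameters shown here (retriever, filter_provider, identity) and the provider wiring are illustrative assumptions, not the package's confirmed signature:

```python
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings
from langchain_plainid import (
    PlainIDPermissionsProvider,
    PlainIDRetriever,
    PlainIDRetrieverFilterProvider,
)

# An ordinary vector store retriever to wrap.
vector_store = FAISS.from_texts(
    ["Q3 revenue report", "Employee onboarding guide"],
    embedding=OpenAIEmbeddings(),
)

# Credentials are picked up from the environment configured above.
permissions_provider = PlainIDPermissionsProvider()
filter_provider = PlainIDRetrieverFilterProvider(
    permissions_provider=permissions_provider
)

# The wrapper fetches PlainID filters for the given identity and applies them
# to every vector store query. Parameter names are illustrative.
retriever = PlainIDRetriever(
    retriever=vector_store.as_retriever(),
    filter_provider=filter_provider,
    identity="alice@example.com",
)

docs = retriever.invoke("What was our Q3 revenue?")
```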
Prompt Authorization
The PlainIDCategorizer can be placed at the beginning of a chain to authorize the user's intent. It classifies the input prompt (e.g., "HR", "Finance", "Contract") and checks with PlainID if the user is permitted to ask about that category. If not authorized, it raises a ValueError.
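A hedged sketch of placing the categorizer at the head of an LCEL chain; how the user identity and the input payload are passed (the identity argument and the question key below) is an assumption for illustration:

```python
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI
from langchain_plainid import PlainIDCategorizer

# Classifies the incoming prompt (e.g. "HR", "Finance", "Contract") and asks
# PlainID whether this identity may ask about that category.
# The constructor argument is illustrative.
categorizer = PlainIDCategorizer(identity="alice@example.com")

prompt = ChatPromptTemplate.from_template("Answer the question: {question}")
llm = ChatOpenAI(model="gpt-4o-mini")

chain = categorizer | prompt | llm

try:
    answer = chain.invoke({"question": "What is the HR policy on parental leave?"})
except ValueError as err:
    # Raised when PlainID denies the category for this user.
    print(f"Blocked by policy: {err}")
```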
PII Anonymization
The PlainIDAnonymizer can be placed at the end of a chain to inspect the LLM's response. It uses Presidio to detect PII entities (like "PERSON", "PHONE_NUMBER") and then consults PlainID on whether to MASK or ENCRYPT them based on defined policies.
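A sketch of appending the anonymizer to a chain; the entities argument is an assumption, and the real component may configure Presidio differently:

```python
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI
from langchain_plainid import PlainIDAnonymizer

# Detects PII in the model output with Presidio, then masks or encrypts it
# according to PlainID policy. The constructor argument is illustrative.
anonymizer = PlainIDAnonymizer(entities=["PERSON", "PHONE_NUMBER"])

prompt = ChatPromptTemplate.from_template("Summarize this support ticket: {ticket}")
llm = ChatOpenAI(model="gpt-4o-mini")

# The anonymizer runs last, on the plain-text response.
chain = prompt | llm | StrOutputParser() | anonymizer

safe_output = chain.invoke(
    {"ticket": "John Smith called from 555-0100 about a billing issue."}
)
print(safe_output)
```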
For more detailed information and full examples, refer to the langchain_plainid PyPI page.