UpstashRatelimitHandler
This handler uses Upstash's ratelimit library, which utilizes Upstash Redis. Upstash Ratelimit works by sending an HTTP request to Upstash Redis every time the limit method is called. The user's remaining tokens/requests are checked and updated, and based on the remaining count we can stop the execution of costly operations, like invoking an LLM or querying a vector store. UpstashRatelimitHandler allows you to incorporate this ratelimit logic into your chain in a few minutes.
Setup
First, you will need to go to the Upstash Console and create a Redis database (see our docs). After creating a database, set the UPSTASH_REDIS_REST_URL and UPSTASH_REDIS_REST_TOKEN environment variables with the credentials shown in the console, and install the @langchain/community package with npm.
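A sketch of these setup steps (placeholder values; I have also included @upstash/ratelimit and @upstash/redis, which are needed to construct the limiter instances used in the examples):

```shell
# Upstash Redis REST credentials from the Upstash Console
export UPSTASH_REDIS_REST_URL="****"
export UPSTASH_REDIS_REST_TOKEN="****"

# The handler lives in @langchain/community; the limiter comes from Upstash's packages
npm install @langchain/community @upstash/ratelimit @upstash/redis
```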
Ratelimiting Per Request
Let's imagine that we want to allow our users to invoke our chain 10 times per minute. Achieving this is as simple as creating a ratelimit, wrapping it in an UpstashRatelimitHandler, and passing the handler to the invoke method instead of passing the handler when defining the chain.
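A minimal sketch, assuming the handler's import path in @langchain/community and using a pass-through RunnableLambda to stand in for a real chain (the user_id value would normally come from your auth layer):

```typescript
import { Ratelimit } from "@upstash/ratelimit";
import { Redis } from "@upstash/redis";
import { UpstashRatelimitHandler } from "@langchain/community/callbacks/handlers/upstash_ratelimit";
import { RunnableLambda } from "@langchain/core/runnables";

// Allow 10 requests per minute per user
const ratelimit = new Ratelimit({
  redis: Redis.fromEnv(),
  limiter: Ratelimit.fixedWindow(10, "1 m"),
});

const user_id = "user_id"; // in practice, fetch the id of the current user
const handler = new UpstashRatelimitHandler(user_id, {
  requestRatelimit: ratelimit,
});

// Stand-in for any chain
const chain = new RunnableLambda({ func: (str: string): string => str });

try {
  // The handler is passed to invoke, not to the chain definition
  const response = await chain.invoke("hello world", {
    callbacks: [handler],
  });
  console.log(response);
} catch (err) {
  console.log("Handling ratelimit:", err);
}
```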
For rate limiting algorithms other than FixedWindow, see the upstash-ratelimit docs.
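For intuition, the fixed-window algorithm can be sketched locally like this (a simplified in-memory version; Upstash keeps the counters in Redis so the limit holds across processes):

```typescript
// Simplified in-memory sketch of a fixed-window rate limiter.
class FixedWindow {
  private counts = new Map<number, number>();

  constructor(private limit: number, private windowMs: number) {}

  // Returns true if a request at time `now` (ms) is allowed.
  limitOne(now: number): boolean {
    // All timestamps in the same window share one counter
    const window = Math.floor(now / this.windowMs);
    const used = this.counts.get(window) ?? 0;
    if (used >= this.limit) return false;
    this.counts.set(window, used + 1);
    return true;
  }
}
```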
Before executing any steps in our pipeline, ratelimit will check whether the user has exceeded the request limit. If so, UpstashRatelimitError is raised.
Ratelimiting Per Token
Another option is to rate limit chain invocations based on:
- the number of tokens in the prompt
- the number of tokens in the prompt and the LLM completion
In this case, the token counts are read from the LLMOutput. The format of the token usage dictionary returned depends on the LLM. To learn how you should configure the handler depending on your LLM, see the end of the Configuration section below.
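A sketch of token-based limiting, assuming ChatOpenAI from @langchain/openai as the LLM step (any LLM that reports token usage works; the limit value is illustrative):

```typescript
import { Ratelimit } from "@upstash/ratelimit";
import { Redis } from "@upstash/redis";
import { UpstashRatelimitHandler } from "@langchain/community/callbacks/handlers/upstash_ratelimit";
import { ChatOpenAI } from "@langchain/openai";

// Allow 500 tokens per minute per user
const ratelimit = new Ratelimit({
  redis: Redis.fromEnv(),
  limiter: Ratelimit.fixedWindow(500, "1 m"),
});

const user_id = "user_id";
const handler = new UpstashRatelimitHandler(user_id, {
  tokenRatelimit: ratelimit,
});

const model = new ChatOpenAI({ model: "gpt-4o-mini" });

try {
  const response = await model.invoke("hello world", {
    callbacks: [handler],
  });
  console.log(response);
} catch (err) {
  console.log("Handling ratelimit:", err);
}
```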
How it works
The handler gets the remaining tokens before calling the LLM. If the number of remaining tokens is greater than 0, the LLM is called. Otherwise, UpstashRatelimitError is raised.
After the LLM is called, the token usage information is subtracted from the user's remaining tokens. No error is raised at this stage of the chain.
Configuration
The handler accepts a requestRatelimit parameter for per-request limiting and a tokenRatelimit parameter for per-token limiting. Pass either one of them, or both to enforce the two limits at the same time.
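For instance, a handler enforcing both limits at once might look like this (limiter settings are illustrative):

```typescript
import { Ratelimit } from "@upstash/ratelimit";
import { Redis } from "@upstash/redis";
import { UpstashRatelimitHandler } from "@langchain/community/callbacks/handlers/upstash_ratelimit";

const redis = Redis.fromEnv();

const handler = new UpstashRatelimitHandler("user_id", {
  // at most 10 requests per minute
  requestRatelimit: new Ratelimit({
    redis,
    limiter: Ratelimit.fixedWindow(10, "1 m"),
  }),
  // at most 1000 tokens per minute
  tokenRatelimit: new Ratelimit({
    redis,
    limiter: Ratelimit.fixedWindow(1000, "1 m"),
  }),
});
```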
For token usage to work correctly, the LLM step in LangChain.js should return a token usage field in the following format:
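Based on the handler's default field names (tokenUsage, totalTokens, promptTokens; verify against your LLM's actual output), the expected shape is roughly:

```json
{
  "tokenUsage": {
    "totalTokens": 123,
    "promptTokens": 456,
    "completionTokens": 789
  }
}
```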
If your LLM returns token usage in a different format, you can configure llmOutputTokenUsageField, llmOutputTotalTokenField and llmOutputPromptTokenField by passing them to the handler:
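For example, if your LLM reported usage under the hypothetical keys usage, total_tokens and prompt_tokens, the handler could be configured as:

```typescript
import { Ratelimit } from "@upstash/ratelimit";
import { Redis } from "@upstash/redis";
import { UpstashRatelimitHandler } from "@langchain/community/callbacks/handlers/upstash_ratelimit";

const ratelimit = new Ratelimit({
  redis: Redis.fromEnv(),
  limiter: Ratelimit.fixedWindow(500, "1 m"),
});

const handler = new UpstashRatelimitHandler("user_id", {
  tokenRatelimit: ratelimit,
  // hypothetical field names -- match them to your LLM's output
  llmOutputTokenUsageField: "usage",
  llmOutputTotalTokenField: "total_tokens",
  llmOutputPromptTokenField: "prompt_tokens",
});
```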