In this guide, we will go over how to add rate limiting based on the number of requests or the number of tokens using UpstashRatelimitHandler. This handler uses Upstash's ratelimit library, which utilizes Upstash Redis.
Upstash Ratelimit works by sending an HTTP request to Upstash Redis every time the limit method is called. The user's remaining tokens/requests are checked and updated. Based on the remaining tokens, we can stop the execution of costly operations such as invoking an LLM or querying a vector store.
UpstashRatelimitHandler allows you to incorporate this ratelimit logic into your chain in a few minutes.
First, install the @langchain/community package:
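For example, with npm (assuming you also want the Upstash clients, which the handler builds on):

```shell
# Install the community package plus the Upstash ratelimit and Redis clients.
npm install @langchain/community @langchain/core @upstash/ratelimit @upstash/redis
```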
You should pass the handler to the invoke method instead of passing the handler when defining the chain.
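A minimal sketch of this setup (the FixedWindow limit of 10 requests per 10 seconds is an arbitrary choice, "user_id" is a placeholder identifier, and the Upstash credentials are assumed to be set in the environment):

```typescript
import { Ratelimit } from "@upstash/ratelimit";
import { Redis } from "@upstash/redis";
import { UpstashRatelimitHandler } from "@langchain/community/callbacks/handlers/upstash_ratelimit";
import { RunnableLambda } from "@langchain/core/runnables";

// Allow each user 10 requests per 10-second fixed window.
const ratelimit = new Ratelimit({
  // Reads UPSTASH_REDIS_REST_URL and UPSTASH_REDIS_REST_TOKEN from the environment.
  redis: Redis.fromEnv(),
  limiter: Ratelimit.fixedWindow(10, "10 s"),
});

// The handler is keyed by the user's identifier.
const handler = new UpstashRatelimitHandler("user_id", {
  requestRatelimit: ratelimit,
});

// A stand-in chain; any runnable works here.
const chain = new RunnableLambda({ func: (str: string) => str });

// Pass the handler at invoke time, not when defining the chain.
const response = await chain.invoke("hello world", {
  callbacks: [handler],
});
```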
For rate limiting algorithms other than FixedWindow, see the upstash-ratelimit docs.
Before executing any steps in our pipeline, ratelimit will check whether the user has exceeded the request limit. If so, an UpstashRatelimitError is raised.
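One way to handle this error is with a try/catch around the invocation. The following is a sketch (limits and the "user_id" identifier are placeholders, and Upstash credentials are assumed to be in the environment):

```typescript
import { Ratelimit } from "@upstash/ratelimit";
import { Redis } from "@upstash/redis";
import {
  UpstashRatelimitError,
  UpstashRatelimitHandler,
} from "@langchain/community/callbacks/handlers/upstash_ratelimit";
import { RunnableLambda } from "@langchain/core/runnables";

const handler = new UpstashRatelimitHandler("user_id", {
  requestRatelimit: new Ratelimit({
    redis: Redis.fromEnv(),
    limiter: Ratelimit.fixedWindow(10, "10 s"),
  }),
});

const chain = new RunnableLambda({ func: (str: string) => str });

try {
  const response = await chain.invoke("hello world", { callbacks: [handler] });
  console.log(response);
} catch (err) {
  if (err instanceof UpstashRatelimitError) {
    // The user is over their limit; respond accordingly instead of crashing.
    console.log("Handling ratelimit.", err);
  } else {
    throw err;
  }
}
```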
The handler gets the token usage information from the LLMOutput. The format of the token usage dictionary returned depends on the LLM. To learn how you should configure the handler depending on your LLM, see the end of the Configuration section below.
Before the LLM is invoked, ratelimit checks whether the user has enough remaining tokens. If not, an UpstashRatelimitError will be raised.
After the LLM is called, the token usage information is subtracted from the user's remaining tokens. No error is raised at this stage of the chain.
To employ both request and token rate limiting, pass both the requestRatelimit and tokenRatelimit parameters.
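For example, a handler with both limits might look like the following sketch (the specific limits and the "user_id" identifier are arbitrary, and Upstash credentials are assumed to be in the environment):

```typescript
import { Ratelimit } from "@upstash/ratelimit";
import { Redis } from "@upstash/redis";
import { UpstashRatelimitHandler } from "@langchain/community/callbacks/handlers/upstash_ratelimit";

const redis = Redis.fromEnv();

const handler = new UpstashRatelimitHandler("user_id", {
  // At most 10 requests per minute...
  requestRatelimit: new Ratelimit({
    redis,
    limiter: Ratelimit.fixedWindow(10, "60 s"),
  }),
  // ...and at most 1000 tokens per minute.
  tokenRatelimit: new Ratelimit({
    redis,
    limiter: Ratelimit.fixedWindow(1000, "60 s"),
  }),
});
```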
For token usage to work correctly, the LLM step in LangChain.js should return a token usage field in the following format:
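With the handler's default field names, an llmOutput shaped like the following object would work (the numeric values here are purely illustrative):

```typescript
// Shape of llmOutput expected by the handler's default configuration:
// a tokenUsage dict containing totalTokens and promptTokens.
const llmOutput = {
  tokenUsage: {
    totalTokens: 20,
    promptTokens: 15,
    completionTokens: 5,
  },
};
```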
If your LLM returns token usage in a different format, you can configure llmOutputTokenUsageField, llmOutputTotalTokenField, and llmOutputPromptTokenField by passing them to the handler:
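For instance, if your LLM reported usage under different keys, the handler could be configured as follows (the snake_case field names below are hypothetical, chosen only to illustrate overriding the defaults; limits and the "user_id" identifier are placeholders):

```typescript
import { Ratelimit } from "@upstash/ratelimit";
import { Redis } from "@upstash/redis";
import { UpstashRatelimitHandler } from "@langchain/community/callbacks/handlers/upstash_ratelimit";

const handler = new UpstashRatelimitHandler("user_id", {
  tokenRatelimit: new Ratelimit({
    redis: Redis.fromEnv(),
    limiter: Ratelimit.fixedWindow(1000, "60 s"),
  }),
  // Tell the handler where to find usage info in the LLM output:
  llmOutputTokenUsageField: "usage",          // hypothetical field name
  llmOutputTotalTokenField: "total_tokens",   // hypothetical field name
  llmOutputPromptTokenField: "prompt_tokens", // hypothetical field name
});
```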