DeepInfra is a serverless inference-as-a-service platform that provides access to a variety of LLMs and embedding models. This notebook goes over how to use LangChain with DeepInfra for chat models.
Make sure to get your API key from DeepInfra. You have to log in and get a new token. You are given 1 hour of free serverless GPU compute to test different models (see here).
You can print your token with `deepctl auth token`.
```python
# get a new token: https://deepinfra.com/login?from=%2Fdash
import os
from getpass import getpass

from langchain_community.chat_models import ChatDeepInfra
from langchain_core.messages import HumanMessage

DEEPINFRA_API_TOKEN = getpass()

# or pass deepinfra_api_token parameter to the ChatDeepInfra constructor
os.environ["DEEPINFRA_API_TOKEN"] = DEEPINFRA_API_TOKEN

chat = ChatDeepInfra(model="meta-llama/Llama-2-7b-chat-hf")

messages = [
    HumanMessage(
        content="Translate this sentence from English to French. I love programming."
    )
]
chat.invoke(messages)
```
DeepInfra currently supports tool calling only via `invoke` and async `ainvoke`. For a complete list of models that support tool calling, please refer to our tool calling documentation.