ChatNVIDIA
features and configurations head to the API reference.
langchain-nvidia-ai-endpoints
package contains LangChain integrations building applications with models on
NVIDIA NIM inference microservice. NIM supports models across domains like chat, embedding, and re-ranking models
from the community as well as NVIDIA. These models are optimized by NVIDIA to deliver the best performance on NVIDIA
accelerated infrastructure and deployed as a NIM, an easy-to-use, prebuilt containers that deploy anywhere using a single
command on NVIDIA accelerated infrastructure.
NVIDIA hosted deployments of NIMs are available to test on the NVIDIA API catalog. After testing,
NIMs can be exported from NVIDIA’s API catalog using the NVIDIA AI Enterprise license and run on-premises or in the cloud,
giving enterprises ownership and full control of their IP and AI application.
NIMs are packaged as container images on a per model basis and are distributed as NGC container images through the NVIDIA NGC Catalog.
At their core, NIMs provide easy, consistent, and familiar APIs for running inference on an AI model.
This example goes over how to use LangChain to interact with NVIDIA supported via the ChatNVIDIA
class.
For more information on accessing the chat models through this api, check out the ChatNVIDIA documentation.
Class | Package | Local | Serializable | JS support | Package downloads | Package latest |
---|---|---|---|---|---|---|
ChatNVIDIA | langchain-nvidia-ai-endpoints | ✅ | beta | ❌ |
Tool calling | Structured output | JSON mode | Image input | Audio input | Video input | Token-level streaming | Native async | Token usage | Logprobs |
---|---|---|---|---|---|---|---|---|---|
✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | ✅ | ✅ | ❌ |
Input
select the Python
tab, and click Get API Key
. Then click Generate Key
.
NVIDIA_API_KEY
. From there, you should have access to the endpoints.
langchain-nvidia-ai-endpoints
package:
available_models
will still give you all of the other models offered by your API credentials.
The playground_
prefix is optional.
ChatNVIDIA
.
Some model types support unique prompting techniques and chat messages. We will review a few important ones below.
To find out more about a specific model, please navigate to the API section of an AI Foundation model as linked here.
meta/llama3-8b-instruct
and mistralai/mixtral-8x22b-instruct-v0.1
are good all-around models that you can use for with any LangChain chat messages. Example below.
meta/codellama-70b
.
nvidia/neva-22b
.
Below is an example use:
<img/>
HTML tags. While this isn’t interoperable with other LLMs, you can directly prompt the model accordingly.
ConversationChain
. Below, we show the LangChain RunnableWithMessageHistory example applied to the mistralai/mixtral-8x22b-instruct-v0.1
model.
ChatNVIDIA
supports bind_tools.
ChatNVIDIA
provides integration with the variety of models on build.nvidia.com as well as local NIMs. Not all these models are trained for tool calling. Be sure to select a model that does have tool calling for your experimention and applications.
You can get a list of models that are known to support tool calling with,
ChatNVIDIA
features and configurations head to the API reference: https://python.langchain.com/api_reference/nvidia_ai_endpoints/chat_models/langchain_nvidia_ai_endpoints.chat_models.ChatNVIDIA.html