Azure Machine Learning is a platform used to build, train, and deploy machine learning models. Users can explore the types of models to deploy in the Model Catalog, which provides foundational and general-purpose models from different providers.

In general, you need to deploy models in order to consume their predictions (inference). In Azure Machine Learning, Online Endpoints are used to deploy these models with real-time serving. They are based on the ideas of Endpoints and Deployments, which allow you to decouple the interface of your production workload from the implementation that serves it.

This notebook goes over how to use a chat model hosted on an Azure Machine Learning Endpoint.
To connect to a deployed model, you need the following parameters:

- `endpoint_url`: The REST endpoint URL provided by the endpoint.
- `endpoint_api_type`: Use `endpoint_api_type='dedicated'` when deploying models to Dedicated endpoints (hosted managed infrastructure). Use `endpoint_api_type='serverless'` when deploying models using the Pay-as-you-go offering (model as a service).
- `endpoint_api_key`: The API key provided by the endpoint.

The `content_formatter` parameter is a handler class for transforming the request and response of an AzureML endpoint to match the required schema. Since there is a wide range of models in the model catalog, each of which may process data differently from one another, a `ContentFormatterBase` class is provided to allow users to transform data to their liking. The following content formatters are provided:
- `CustomOpenAIChatContentFormatter`: Formats request and response data for models like LLaMa2-chat that follow the OpenAI API spec for request and response.

Note: `langchain.chat_models.azureml_endpoint.LlamaChatContentFormatter` is being deprecated and replaced with `langchain.chat_models.azureml_endpoint.CustomOpenAIChatContentFormatter`.
You can implement custom content formatters specific to your model by deriving from the class `langchain_community.llms.azureml_endpoint.ContentFormatterBase`.
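The exact methods to override are defined by `ContentFormatterBase` itself; conceptually, a formatter serializes chat messages into the JSON body the deployment expects and parses the response back. The following stdlib-only sketch of the request side uses a hypothetical helper (not the LangChain interface) to show the kind of transformation a formatter performs for an OpenAI-spec model:

```python
import json


def format_openai_chat_request(messages, **model_kwargs):
    """Hypothetical helper: serialize (role, content) pairs into an
    OpenAI-style chat request body, similar to what a content formatter
    produces for models that follow the OpenAI API spec."""
    body = {
        "messages": [
            {"role": role, "content": content} for role, content in messages
        ],
        **model_kwargs,  # extra parameters such as temperature ride along
    }
    return json.dumps(body).encode("utf-8")


payload = format_openai_chat_request(
    [("system", "You are helpful."), ("user", "Hello!")],
    temperature=0.8,
)
```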
If you need to pass additional parameters to the model, use the `model_kwargs` argument: