SageMaker Endpoints Embeddings class. Use this class if you host a model on SageMaker, for example your own Hugging Face model.
For instructions on how to do this, see custom inference with Hugging Face on SageMaker.
Note: To handle batched requests, adjust the return line in the predict_fn() function of the custom inference.py script so that it returns the embeddings for every input rather than only the first one.
Change from:
return {"vectors": sentence_embeddings[0].tolist()}
to:
return {"vectors": sentence_embeddings.tolist()}
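To see why the batched return matters on the client side, here is a minimal sketch of how a request could be serialized and the response parsed. The function names `transform_input` and `transform_output`, the `"inputs"` request key, and the simulated response are assumptions made for illustration. Only the `"vectors"` response key comes from the note above. The sketch uses the standard library alone and does not call SageMaker.

```python
import io
import json
from typing import Dict, List


def transform_input(inputs: List[str], model_kwargs: Dict) -> bytes:
    """Serialize a batch of texts into a JSON request body.

    The "inputs" key is an assumed request shape; match it to whatever
    your inference.py script actually reads.
    """
    payload = {"inputs": inputs, **model_kwargs}
    return json.dumps(payload).encode("utf-8")


def transform_output(output) -> List[List[float]]:
    """Parse the endpoint response and return one vector per input text."""
    response = json.loads(output.read().decode("utf-8"))
    return response["vectors"]


texts = ["first sentence", "second sentence"]
request_body = transform_input(texts, {})

# Simulated response from an endpoint whose predict_fn uses the batched
# return line: "vectors" holds one embedding per input text.
fake_response = io.BytesIO(
    json.dumps({"vectors": [[0.1, 0.2], [0.3, 0.4]]}).encode("utf-8")
)
vectors = transform_output(fake_response)

# With the unbatched return line, "vectors" would be a single flat list,
# so this length check would fail for any batch with more than one text.
assert len(vectors) == len(texts)
```

With the original return line, the endpoint would send back only the first embedding, and the number of vectors would no longer match the number of texts in the batch.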

