Local Embeddings
You can generate embeddings locally using the HuggingFaceEmbeddings class. It uses the sentence_transformers library to download the model weights and run them directly on your machine.
Let’s load the Hugging Face Embedding class.
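A minimal sketch of local usage follows, assuming the langchain-huggingface and sentence-transformers packages are installed. The model name `sentence-transformers/all-mpnet-base-v2` is an illustrative choice, and `build_local_embedder` and `cosine_similarity` are hypothetical helper names introduced here, not part of the library.

```python
import math


def build_local_embedder(model_name: str = "sentence-transformers/all-mpnet-base-v2"):
    """Create a HuggingFaceEmbeddings instance that runs on this machine.

    The default model name is an assumption chosen for illustration.
    """
    # Imported lazily so the pure-Python helper below works without the package.
    from langchain_huggingface import HuggingFaceEmbeddings

    # The model weights are downloaded on first use and cached locally afterwards.
    return HuggingFaceEmbeddings(model_name=model_name)


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    embeddings = build_local_embedder()

    text = "This is a test document."
    query_vector = embeddings.embed_query(text)  # one vector for a single string
    doc_vectors = embeddings.embed_documents([text, "An unrelated sentence."])

    print(len(query_vector))  # embedding dimension of the chosen model
    print(cosine_similarity(query_vector, doc_vectors[0]))
    print(cosine_similarity(query_vector, doc_vectors[1]))
```

`embed_query` returns a single vector and `embed_documents` returns one vector per input string, so the two similarity scores give a quick check that the model is loaded and producing sensible output.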
Hugging Face Inference Endpoints (Serverless API)
If you prefer not to download models locally, you can access embedding models via the Inference Endpoints, which let you use open-source models on Hugging Face’s scalable serverless infrastructure. Ensure you have huggingface_hub installed, which is usually included with langchain-huggingface. Use the HuggingFaceEndpointEmbeddings class to run open-source embedding models remotely via the API.
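The remote setup might look like the sketch below. It assumes a Hugging Face access token is available in the `HUGGINGFACEHUB_API_TOKEN` environment variable. The model name is again an illustrative choice, and `resolve_token` and `build_remote_embedder` are hypothetical helpers, not library functions.

```python
import os


def resolve_token(explicit_token=None):
    """Return the explicit token if given, otherwise read it from the environment."""
    token = explicit_token or os.environ.get("HUGGINGFACEHUB_API_TOKEN")
    if not token:
        raise ValueError(
            "No Hugging Face token found; pass one explicitly or set "
            "HUGGINGFACEHUB_API_TOKEN."
        )
    return token


def build_remote_embedder(
    model: str = "sentence-transformers/all-mpnet-base-v2",
    token=None,
):
    """Create an embedder that calls the hosted API instead of loading weights locally.

    The default model name is an assumption chosen for illustration.
    """
    # Imported lazily so resolve_token stays usable without the package installed.
    from langchain_huggingface import HuggingFaceEndpointEmbeddings

    return HuggingFaceEndpointEmbeddings(
        model=model,
        task="feature-extraction",  # embeddings are served as feature extraction
        huggingfacehub_api_token=resolve_token(token),
    )


if __name__ == "__main__":
    embeddings = build_remote_embedder()
    vector = embeddings.embed_query("This is a test document.")
    print(len(vector))  # embedding dimension reported by the remote model
```

Because the model runs remotely, each call is a network request, so availability and latency depend on the hosted service rather than on your machine.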

