BGE models on Hugging Face are a family of open-source embedding and reranking models published by the Beijing Academy of Artificial Intelligence (BAAI). BGE was one of the leading open-source embedding families in 2023 and 2024, and while newer models on the MTEB leaderboard have since surpassed them on raw retrieval scores, BGE (and BAAI/bge-m3 in particular) remains a widely used, well-balanced default for multilingual retrieval.
LangChain provides two ways to use BGE models:
- HuggingFaceEmbeddings from langchain-huggingface: the generic Sentence Transformers class. It covers every BGE variant and is the recommended choice for new projects.
- HuggingFaceBgeEmbeddings from langchain-community: a BGE-specific wrapper that automatically prepends the query instruction used by the older English v1.5 models. Convenient when you specifically want a v1.5 model and don’t want to manage the prompt yourself.
BAAI/bge-m3 and newer
BAAI/bge-m3 is trained without a query prompt, so no extra configuration is needed. normalize_embeddings=True is recommended for cosine similarity, per the model authors.
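A minimal sketch using the generic class (the query and document strings are illustrative):

```python
# pip install -U langchain-huggingface sentence-transformers
from langchain_huggingface import HuggingFaceEmbeddings

embeddings = HuggingFaceEmbeddings(
    model_name="BAAI/bge-m3",
    encode_kwargs={"normalize_embeddings": True},  # recommended for cosine similarity
)

query_vector = embeddings.embed_query("What is BGE-M3?")
doc_vectors = embeddings.embed_documents(["BGE-M3 is a multilingual embedding model."])
```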
BAAI/bge-*-en-v1.5 (quick path)
The older English v1.5 models expect the query to be prefixed with an instruction. The dedicated HuggingFaceBgeEmbeddings class handles that automatically:
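A sketch along those lines (the model choice and device are illustrative):

```python
# pip install -U langchain-community sentence-transformers
from langchain_community.embeddings import HuggingFaceBgeEmbeddings

embeddings = HuggingFaceBgeEmbeddings(
    model_name="BAAI/bge-large-en-v1.5",
    model_kwargs={"device": "cpu"},
    encode_kwargs={"normalize_embeddings": True},
)

# embed_query prepends the class's default English BGE query instruction;
# embed_documents leaves passages unprefixed, matching how v1.5 was trained.
query_vector = embeddings.embed_query("What is BGE?")
```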
The same v1.5 models also work with the generic HuggingFaceEmbeddings if you prepend the query instruction yourself:
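A minimal sketch; the instruction string below is the one published on the BAAI v1.5 model cards, so verify it against the card for your specific model:

```python
from langchain_huggingface import HuggingFaceEmbeddings

# Instruction the English v1.5 models expect on retrieval queries only.
BGE_QUERY_INSTRUCTION = "Represent this sentence for searching relevant passages: "

embeddings = HuggingFaceEmbeddings(
    model_name="BAAI/bge-large-en-v1.5",
    encode_kwargs={"normalize_embeddings": True},
)

# Prefix queries with the instruction; documents are embedded without it.
query_vector = embeddings.embed_query(BGE_QUERY_INSTRUCTION + "What is BGE?")
doc_vectors = embeddings.embed_documents(["BGE is an embedding family from BAAI."])
```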
Picking a BGE model
| Model | Size | Notes |
|---|---|---|
| BAAI/bge-small-en-v1.5 | 33M | Smallest English model, CPU-friendly |
| BAAI/bge-large-en-v1.5 | 335M | Stronger English model, widely used baseline |
| BAAI/bge-m3 | 570M | Multilingual; dense, sparse, and multi-vector retrieval in one model |
For reranking, pair your retriever with BAAI/bge-reranker-v2-m3 via the Cross Encoder Reranker guide.
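A short sketch of that setup, assuming the cross-encoder classes from langchain-community and langchain plus an in-memory vector store as the first stage (the toy corpus and query are illustrative):

```python
# pip install -U langchain langchain-community langchain-huggingface sentence-transformers
from langchain.retrievers import ContextualCompressionRetriever
from langchain.retrievers.document_compressors import CrossEncoderReranker
from langchain_community.cross_encoders import HuggingFaceCrossEncoder
from langchain_core.vectorstores import InMemoryVectorStore
from langchain_huggingface import HuggingFaceEmbeddings

# First stage: dense retrieval with BGE-M3 over a toy corpus.
embeddings = HuggingFaceEmbeddings(
    model_name="BAAI/bge-m3",
    encode_kwargs={"normalize_embeddings": True},
)
store = InMemoryVectorStore.from_texts(
    ["BGE-M3 is multilingual.", "Rerankers rescore retrieved passages."],
    embedding=embeddings,
)

# Second stage: cross-encoder rescoring, keeping the single best hit.
cross_encoder = HuggingFaceCrossEncoder(model_name="BAAI/bge-reranker-v2-m3")
reranker = CrossEncoderReranker(model=cross_encoder, top_n=1)

retriever = ContextualCompressionRetriever(
    base_compressor=reranker,
    base_retriever=store.as_retriever(search_kwargs={"k": 2}),
)
docs = retriever.invoke("What does a reranker do?")
```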
More
See the Sentence Transformers integration page for GPU configuration, batch sizes, query/document prompts, and deployment options.

