OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference. The OpenVINO™ Runtime supports various hardware devices including x86 and ARM CPUs, and Intel GPUs. It can help boost deep learning performance in Computer Vision, Automatic Speech Recognition, Natural Language Processing and other common tasks. Hugging Face embedding models are supported by OpenVINO through the `OpenVINOEmbeddings` class. If you have an Intel GPU, you can specify `model_kwargs={"device": "GPU"}` to run inference on it.
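As a minimal sketch (assuming `langchain-community` and its OpenVINO dependencies are installed; the model name is just an illustrative choice, any Hugging Face feature-extraction model should work):

```python
from langchain_community.embeddings import OpenVINOEmbeddings

# Illustrative model id; substitute any Hugging Face embedding model.
model_name = "sentence-transformers/all-mpnet-base-v2"
model_kwargs = {"device": "CPU"}  # set to "GPU" to run on an Intel GPU
encode_kwargs = {"mean_pooling": True, "normalize_embeddings": True}

ov_embeddings = OpenVINOEmbeddings(
    model_name_or_path=model_name,
    model_kwargs=model_kwargs,
    encode_kwargs=encode_kwargs,
)

# Embed a single query and a batch of documents.
query_vector = ov_embeddings.embed_query("What is OpenVINO?")
doc_vectors = ov_embeddings.embed_documents(
    ["OpenVINO is a toolkit for optimizing and deploying AI inference."]
)
print(len(query_vector))  # embedding dimensionality, e.g. 768 for mpnet
```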
## Export IR model
It is possible to export your embedding model to the OpenVINO IR format with `OVModelForFeatureExtraction`, and load the model from a local folder.
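A sketch of the export-and-reload flow, assuming `optimum-intel` with OpenVINO support is installed (the output directory name is arbitrary):

```python
from langchain_community.embeddings import OpenVINOEmbeddings
from optimum.intel import OVModelForFeatureExtraction
from transformers import AutoTokenizer

model_id = "sentence-transformers/all-mpnet-base-v2"
ov_model_dir = "all-mpnet-base-v2-ov"  # arbitrary local folder name

# export=True converts the PyTorch checkpoint to OpenVINO IR on the fly.
model = OVModelForFeatureExtraction.from_pretrained(model_id, export=True)
model.save_pretrained(ov_model_dir)
# Save the tokenizer alongside so the folder is self-contained.
AutoTokenizer.from_pretrained(model_id).save_pretrained(ov_model_dir)

# The exported IR folder can then be passed in place of the model id.
ov_embeddings = OpenVINOEmbeddings(
    model_name_or_path=ov_model_dir,
    model_kwargs={"device": "CPU"},
    encode_kwargs={"mean_pooling": True, "normalize_embeddings": True},
)
```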
## BGE with OpenVINO
We can also access BGE embedding models via the `OpenVINOBgeEmbeddings` class.
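A short sketch with a BGE model (`BAAI/bge-small-en-v1.5` is just one of the available BGE checkpoints):

```python
from langchain_community.embeddings import OpenVINOBgeEmbeddings

model_name = "BAAI/bge-small-en-v1.5"
model_kwargs = {"device": "CPU"}  # or "GPU" for an Intel GPU
encode_kwargs = {"normalize_embeddings": True}

ov_bge_embeddings = OpenVINOBgeEmbeddings(
    model_name_or_path=model_name,
    model_kwargs=model_kwargs,
    encode_kwargs=encode_kwargs,
)

embedding = ov_bge_embeddings.embed_query("hi this is harrison")
print(len(embedding))  # 384 for bge-small
```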
For more information, refer to:

- OpenVINO LLM guide.
- OpenVINO Documentation.
- OpenVINO Get Started Guide.
- RAG Notebook with LangChain.

