Azure OpenAI Whisper parser integration

Azure OpenAI Whisper Parser is a wrapper around the Azure OpenAI Whisper API, which uses machine learning to transcribe audio files to English text. The parser supports the .mp3, .mp4, .mpeg, .mpga, .m4a, .wav, and .webm file formats.
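Since only the formats listed above are supported, it can help to check a file's extension before sending it for transcription. The following is a minimal sketch; the helper name is_supported_audio and the constant SUPPORTED_EXTENSIONS are illustrative and not part of LangChain.

```python
from pathlib import Path

# Extensions listed above as supported by the parser.
SUPPORTED_EXTENSIONS = {".mp3", ".mp4", ".mpeg", ".mpga", ".m4a", ".wav", ".webm"}


def is_supported_audio(path: str) -> bool:
    """Return True if the file extension is one of the supported formats."""
    # Hypothetical helper: compares the lowercased suffix against the list above.
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


print(is_supported_audio("meeting.MP3"))  # True
print(is_supported_audio("notes.flac"))   # False
```

The check only looks at the file name, so it does not guarantee that the file content is valid audio.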

The current implementation follows LangChain core principles and can be used with other loaders to handle both audio downloading and parsing. As a result, the parser yields an Iterator[Document].
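To illustrate what yielding an iterator means for the caller, here is a small sketch of the lazy pattern using only the standard library. FakeDocument and fake_lazy_parse are hypothetical stand-ins, not the real Document class or the parser's implementation; the point is only that each item is produced when it is requested.

```python
from dataclasses import dataclass, field
from typing import Iterator


@dataclass
class FakeDocument:
    """Hypothetical stand-in for a LangChain Document."""

    page_content: str
    metadata: dict = field(default_factory=dict)


def fake_lazy_parse(texts: list[str]) -> Iterator[FakeDocument]:
    """Yield one document per input text, only when the caller asks for it."""
    for index, text in enumerate(texts):
        # A real parser would call the transcription service at this point.
        yield FakeDocument(page_content=text, metadata={"index": index})


documents = fake_lazy_parse(["first part", "second part"])
first = next(documents)      # only the first item has been produced so far
remaining = list(documents)  # consuming the iterator produces the rest
```

Because an iterator can be consumed only once, wrap the result in list(...) if the documents are needed more than once.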

Prerequisites

The service requires Azure credentials, an Azure endpoint, and a Whisper model deployment, which can be set up by following the guide here. The required dependencies must also be installed.

pip install -Uq langchain langchain-community openai
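Rather than hard-coding credentials, the values can be read from environment variables. The sketch below assumes the variable names AZURE_OPENAI_API_KEY, AZURE_OPENAI_ENDPOINT, and OPENAI_API_VERSION, which are the names the openai SDK's Azure client looks for; verify them against your installed versions. The helper load_azure_settings is hypothetical.

```python
import os

# Assumed variable names (the ones the openai SDK's Azure client looks for).
REQUIRED_ENV_VARS = {
    "api_key": "AZURE_OPENAI_API_KEY",
    "azure_endpoint": "AZURE_OPENAI_ENDPOINT",
    "api_version": "OPENAI_API_VERSION",
}


def load_azure_settings(env=None) -> dict:
    """Collect the settings from the environment; raise if any is missing."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_ENV_VARS.values() if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    # Keys match the keyword arguments used when constructing the parser.
    return {kwarg: env[name] for kwarg, name in REQUIRED_ENV_VARS.items()}


# Demonstration with placeholder values instead of the real environment.
example_env = {
    "AZURE_OPENAI_API_KEY": "<your_api_key>",
    "AZURE_OPENAI_ENDPOINT": "<your_endpoint>",
    "OPENAI_API_VERSION": "<your_api_version>",
}
settings = load_azure_settings(example_env)
```

Calling load_azure_settings() with no argument reads from os.environ, and the resulting dictionary can be unpacked into the parser constructor alongside deployment_name.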

Example 1

The AzureOpenAIWhisperParser’s .lazy_parse method accepts a Blob object that holds the path of the file to be transcribed.

from langchain_core.documents.base import Blob

audio_path = "path/to/your/audio/file"
audio_blob = Blob(path=audio_path)

from langchain_community.document_loaders.parsers.audio import AzureOpenAIWhisperParser

endpoint = "<your_endpoint>"
key = "<your_api_key>"
version = "<your_api_version>"
name = "<your_deployment_name>"

parser = AzureOpenAIWhisperParser(
    api_key=key, azure_endpoint=endpoint, api_version=version, deployment_name=name
)

documents = parser.lazy_parse(blob=audio_blob)

for doc in documents:
    print(doc.page_content)

Example 2

The AzureOpenAIWhisperParser can also be used in conjunction with audio loaders, like the YoutubeAudioLoader with a GenericLoader.

from langchain_community.document_loaders.blob_loaders.youtube_audio import (
    YoutubeAudioLoader,
)
from langchain_community.document_loaders.generic import GenericLoader
from langchain_community.document_loaders.parsers.audio import AzureOpenAIWhisperParser

# URLs must be passed as a list, even for a single video
url = ["www.youtube.url.com"]

save_dir = "save/directory/"

name = "<your_deployment_name>"

loader = GenericLoader(
    YoutubeAudioLoader(url, save_dir), AzureOpenAIWhisperParser(deployment_name=name)
)

docs = loader.load()

for doc in docs:
    print(doc.page_content)
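Example 2 notes that the URLs must be given as a list. A bare string is itself iterable in Python, so passing one by mistake would be treated as a sequence of single characters. A small guard such as the hypothetical as_url_list helper below, which is not part of LangChain, avoids that.

```python
from typing import Iterable, Union


def as_url_list(urls: Union[str, Iterable[str]]) -> list[str]:
    """Return the URLs as a list, wrapping a single string if needed."""
    if isinstance(urls, str):
        # A lone string would otherwise be iterated character by character.
        return [urls]
    return list(urls)


single = as_url_list("www.youtube.url.com")
several = as_url_list(("www.youtube.url.com", "www.youtube.url.com"))
```

The result can be passed as the first argument to YoutubeAudioLoader in place of the url variable.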
