Setup
Install the package:
Credentials
Get your Soniox API key from the Soniox Console and set it as the SONIOX_API_KEY environment variable.
Usage
Basic transcription
This example transcribes an audio file with the SonioxAudioTranscriptLoader and generates a summary with an LLM.
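The flow described above can be sketched without depending on any particular package. This is a minimal sketch, not the library's own example: `TranscriptLoader`, `Complete`, and `summarizeTranscript` are hypothetical names, and it assumes the transcript text is available as the Document's `pageContent`.

```typescript
// Minimal shapes so the sketch compiles without imports.
interface TranscriptDocument {
  pageContent: string;
  metadata: Record<string, unknown>;
}

// Anything exposing load() fits here, such as a SonioxAudioTranscriptLoader instance.
interface TranscriptLoader {
  load(): Promise<TranscriptDocument[]>;
}

// Hypothetical stand-in for an LLM call: prompt in, text out.
type Complete = (prompt: string) => Promise<string>;

async function summarizeTranscript(
  loader: TranscriptLoader,
  complete: Complete,
): Promise<string> {
  const docs = await loader.load();
  const first = docs[0];
  if (!first) {
    throw new Error("Loader returned no documents");
  }
  // Assumption: the transcript text lives in pageContent.
  return complete(`Summarize the following transcript:\n\n${first.pageContent}`);
}
```

Passing the loader and the LLM call in as arguments keeps the sketch independent of import paths, which this section does not show, and makes it easy to try with stubs before spending API credits.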
Translation
Translate from any detected language to a target language.
The two_way translation type is also available. Learn more about translation here.
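As a rough illustration, the `translation` option could be modeled as a discriminated union. Only `two_way` is named on this page. The `one_way` type and the field names `target_language`, `language_a`, and `language_b` are assumptions based on the Soniox REST API, and the helper functions are hypothetical.

```typescript
// Assumed shapes; verify the field names against the Soniox translation docs.
type TranslationConfig =
  | { type: "one_way"; target_language: string }
  | { type: "two_way"; language_a: string; language_b: string };

// Translate everything that is spoken into a single target language.
function oneWay(target: string): TranslationConfig {
  return { type: "one_way", target_language: target };
}

// Translate between two languages in both directions.
function twoWay(a: string, b: string): TranslationConfig {
  if (a === b) {
    throw new Error("two_way translation needs two different languages");
  }
  return { type: "two_way", language_a: a, language_b: b };
}

// Would be passed as part of the loader options.
const translationOptions = { translation: oneWay("en") };
```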
Language hints
Soniox automatically detects and transcribes speech in 60+ languages. When you know which languages are likely to appear in your audio, provide language_hints to improve accuracy by biasing recognition toward those languages.
Language hints do not restrict recognition — they only bias the model toward the specified languages, while still allowing other languages to be detected if present.
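A small helper can keep the hint list tidy before it is passed in the options. This is only a sketch: `languageHintOptions` is a hypothetical name, and the language codes in the usage note are an assumption about the expected format.

```typescript
interface LanguageHintOptions {
  language_hints?: string[];
  language_hints_strict?: boolean;
}

// Trim and de-duplicate hints, keeping the original order.
function languageHintOptions(hints: string[], strict = false): LanguageHintOptions {
  const cleaned = Array.from(
    new Set(hints.map((h) => h.trim()).filter((h) => h.length > 0)),
  );
  if (cleaned.length === 0) {
    // No usable hints: leave language detection fully automatic.
    return {};
  }
  return strict
    ? { language_hints: cleaned, language_hints_strict: true }
    : { language_hints: cleaned };
}
```

For example, `languageHintOptions(["en", "es"])` yields an options fragment that biases recognition toward English and Spanish. The `strict` flag maps to language_hints_strict from the options table.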
Speaker diarization
Set enable_speaker_diarization to true to distinguish between different speakers.
Language identification
Set enable_language_identification to true to enable automatic language detection and identification.
Context for improved accuracy
Provide domain-specific context through the context option to improve transcription accuracy.
API reference
Constructor parameters
SonioxLoaderParams (required)
| Parameter | Type | Required | Description |
|---|---|---|---|
| audio | Uint8Array \| string | Yes | Audio file as buffer or URL |
| audioFormat | SonioxAudioFormat | No | Audio file format |
| apiKey | string | No | Soniox API key (defaults to SONIOX_API_KEY env var) |
| apiBaseUrl | string | No | API base URL (defaults to https://api.soniox.com/v1) |
| pollingIntervalMs | number | No | Polling interval in ms (min: 1000, default: 1000) |
| pollingTimeoutMs | number | No | Polling timeout in ms (default: 180000) |
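The defaults in the table can be summarized as a small resolution step. This is a sketch of the documented behavior, not the loader's actual code: `resolveConnection` and both interfaces are hypothetical. Clamping a too-small polling interval up to 1000 ms is a choice made here, and the real loader may reject such values instead.

```typescript
interface ConnectionParams {
  apiKey?: string;
  apiBaseUrl?: string;
  pollingIntervalMs?: number;
  pollingTimeoutMs?: number;
}

interface ResolvedConnection {
  apiKey: string;
  apiBaseUrl: string;
  pollingIntervalMs: number;
  pollingTimeoutMs: number;
}

function resolveConnection(
  params: ConnectionParams,
  env: Record<string, string | undefined>,
): ResolvedConnection {
  // An explicit apiKey wins; otherwise fall back to the environment.
  const apiKey = params.apiKey ?? env["SONIOX_API_KEY"];
  if (!apiKey) {
    throw new Error("Set SONIOX_API_KEY or pass apiKey");
  }
  return {
    apiKey,
    apiBaseUrl: params.apiBaseUrl ?? "https://api.soniox.com/v1",
    // Assumption: values below the documented minimum are raised to it.
    pollingIntervalMs: Math.max(1000, params.pollingIntervalMs ?? 1000),
    pollingTimeoutMs: params.pollingTimeoutMs ?? 180000,
  };
}
```

In Node.js this could be called as `resolveConnection({}, process.env)`. Taking the environment as an argument keeps the function easy to test.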
SonioxLoaderOptions (optional)
| Parameter | Type | Description |
|---|---|---|
| model | SonioxTranscriptionModelId | Model to use (default: "stt-async-v3") |
| translation | object | Translation configuration |
| language_hints | string[] | Language hints for transcription |
| language_hints_strict | boolean | Enforce strict language hints |
| enable_speaker_diarization | boolean | Enable speaker identification |
| enable_language_identification | boolean | Enable language detection |
| context | object | Context for improved accuracy |
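To illustrate how several of these options might be combined, here is a hedged sketch of an options object that enables speaker diarization and language identification and supplies context. `TranscriptionFlags` is a hypothetical local type. The table only says that context is an object, so the `terms` and `text` keys are assumptions borrowed from the Soniox REST API and should be checked before use.

```typescript
// Local type mirroring three rows of the options table.
interface TranscriptionFlags {
  enable_speaker_diarization?: boolean;
  enable_language_identification?: boolean;
  context?: Record<string, unknown>;
}

const transcriptionFlags: TranscriptionFlags = {
  enable_speaker_diarization: true,
  enable_language_identification: true,
  context: {
    // Assumed key: domain vocabulary the model should favor.
    terms: ["Soniox", "LangChain"],
    // Assumed key: free-form background describing the recording.
    text: "Weekly engineering sync about a speech-to-text integration.",
  },
};
```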
Supported audio formats
- aac - Advanced Audio Coding
- aiff - Audio Interchange File Format
- amr - Adaptive Multi-Rate
- asf - Advanced Systems Format
- flac - Free Lossless Audio Codec
- mp3 - MPEG Audio Layer III
- ogg - Ogg Vorbis
- wav - Waveform Audio File Format
- webm - WebM Audio
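When the audio comes from a file path or URL, the format can often be guessed from the extension. The helper below is a hypothetical sketch. It assumes the values accepted for audioFormat match the identifiers listed above, which this section does not state explicitly.

```typescript
const AUDIO_FORMATS = [
  "aac", "aiff", "amr", "asf", "flac", "mp3", "ogg", "wav", "webm",
] as const;

type AudioFormat = (typeof AUDIO_FORMATS)[number];

// Returns the format implied by the file extension, or undefined if unknown.
function inferAudioFormat(pathOrUrl: string): AudioFormat | undefined {
  // Drop any query string or fragment before inspecting the extension.
  const clean = pathOrUrl.split(/[?#]/)[0] ?? "";
  const dot = clean.lastIndexOf(".");
  if (dot === -1) {
    return undefined;
  }
  const ext = clean.slice(dot + 1).toLowerCase();
  const known = AUDIO_FORMATS as readonly string[];
  return known.indexOf(ext) !== -1 ? (ext as AudioFormat) : undefined;
}
```

A result of `undefined` means the extension is not in the list, in which case audioFormat can simply be left unset since it is optional.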
Return value
The load() method returns an array containing a single Document object.
For the structure of the transcription response, see the SonioxTranscriptResponse type in the Soniox REST API Reference.
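If the response tokens are available to your code, speaker labels can be folded into readable turns. The sketch below is illustrative only. The `Token` shape (`text`, optional `speaker` and `language`) is an assumption modeled on the Soniox REST API, and where the tokens sit inside the Document is not specified in this section.

```typescript
// Assumed token shape; check it against SonioxTranscriptResponse.
interface Token {
  text: string;
  speaker?: string;
  language?: string;
}

interface Turn {
  speaker: string;
  text: string;
}

// Merge consecutive tokens from the same speaker into one turn.
function toSpeakerTurns(tokens: Token[]): Turn[] {
  const turns: Turn[] = [];
  for (const token of tokens) {
    const speaker = token.speaker ?? "unknown";
    const last = turns[turns.length - 1];
    if (last && last.speaker === speaker) {
      last.text += token.text;
    } else {
      turns.push({ speaker, text: token.text });
    }
  }
  return turns;
}
```

Tokens without a speaker label, for instance when diarization is disabled, are grouped under "unknown" rather than dropped.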