LangSmith can capture traces generated by Pipecat using OpenTelemetry instrumentation. This guide shows you how to automatically capture traces from your Pipecat voice AI pipelines and send them to LangSmith for monitoring and analysis. For our high-level guiding principles on tracing voice agents, see Voice tracing fundamentals. For a complete implementation, see the voice demo repository.

Installation

Install the required packages:
pip install langsmith "pipecat-ai[whisper,openai,local]" opentelemetry-exporter-otlp python-dotenv
If you plan to use the advanced audio recording features, also install:
pip install scipy numpy

Quickstart tutorial

Follow this step-by-step tutorial to create a voice AI agent with Pipecat and LangSmith tracing. You’ll build a complete working example by copying and pasting code snippets.

Step 1: Set up your environment

Create a .env file in your project directory:
.env
OTEL_EXPORTER_OTLP_ENDPOINT=https://api.smith.langchain.com/otel
OTEL_EXPORTER_OTLP_HEADERS=x-api-key=<your-langsmith-api-key>, Langsmith-Project=pipecat-voice
OPENAI_API_KEY=<your-openai-api-key>

Step 2: Download the span processor

Pipecat emits OpenTelemetry spans, but its attribute names aren’t ones LangSmith recognizes by default. A custom span processor translates those attributes so your traces render properly in LangSmith. Add the custom span processor file and save it as langsmith_processor.py in your project directory.
Key functions of the span processor:
  • Converts Pipecat span types (stt, llm, tts, turn, conversation) to LangSmith format.
  • Adds gen_ai.prompt.* and gen_ai.completion.* attributes for message visualization.
  • Renders the whole-conversation transcript onto the root run.
  • Handles audio file attachments (for advanced usage).
It only reshapes spans it recognizes as Pipecat’s; any other span (for example, a nested LangChain or LangGraph run) passes through untouched. The processor activates when you import it in your code.
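To make the message-visualization step concrete, here is a minimal sketch of flattening a list of chat messages into indexed gen_ai.prompt.* attributes. The helper name flatten_messages is hypothetical and is not taken from langsmith_processor.py; it assumes the indexed role/content naming (gen_ai.prompt.0.role, gen_ai.prompt.0.content, and so on), and the real processor may handle more cases.

```python
from typing import Dict, List


def flatten_messages(messages: List[dict], prefix: str = "gen_ai.prompt") -> Dict[str, str]:
    """Flatten chat messages into indexed span attributes.

    Hypothetical helper for illustration only; not part of langsmith_processor.py.
    """
    attributes: Dict[str, str] = {}
    for index, message in enumerate(messages):
        # One role and one content attribute per message, keyed by position.
        attributes[f"{prefix}.{index}.role"] = str(message.get("role", ""))
        attributes[f"{prefix}.{index}.content"] = str(message.get("content", ""))
    return attributes


attrs = flatten_messages([
    {"role": "system", "content": "You are a helpful voice assistant."},
    {"role": "user", "content": "What's the weather?"},
])
```

Passing prefix="gen_ai.completion" produces the matching attributes for the model's reply.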

Step 3: Create your voice agent file

Create a new file called agent.py and add the following code. We’ll build it section by section so you can copy and paste each part.

Part 1: Import dependencies

import asyncio
import uuid
from dotenv import load_dotenv

# Load environment variables
load_dotenv()

# Import Pipecat components
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.services.whisper.stt import WhisperSTTService
from pipecat.services.openai import OpenAILLMService, OpenAITTSService
from pipecat.transports.local.audio import LocalAudioTransport, LocalAudioTransportParams

# Import the span processor setup to enable LangSmith tracing
from langsmith_processor import setup_langsmith_tracing

Part 2: Define the main function

async def main():
    # Generate unique conversation ID for LangSmith
    conversation_id = str(uuid.uuid4())
    print(f"Starting conversation: {conversation_id}")

    # Configure OpenTelemetry export to LangSmith and register the span processor.
    # This reads OTEL_EXPORTER_OTLP_ENDPOINT / OTEL_EXPORTER_OTLP_HEADERS from your
    # environment and returns the processor so you can register a recording later.
    span_processor = setup_langsmith_tracing()

    # Configure audio input/output with voice activity detection
    transport = LocalAudioTransport(
        LocalAudioTransportParams(
            audio_in_enabled=True,
            audio_out_enabled=True,
            vad_analyzer=SileroVADAnalyzer(),
        )
    )

    # Initialize AI services
    stt = WhisperSTTService()
    llm = OpenAILLMService(model="gpt-5.4-mini")
    tts = OpenAITTSService(voice="alloy")

    # Set up conversation context with system prompt
    context = OpenAILLMContext(
        messages=[
            {
                "role": "system",
                "content": "You are a helpful voice assistant. Keep responses concise and conversational."
            }
        ]
    )
    context_aggregator = llm.create_context_aggregator(context)

    # Build the processing pipeline
    pipeline = Pipeline([
        transport.input(),           # Capture microphone input
        stt,                         # Convert speech to text
        context_aggregator.user(),   # Add user message to context
        llm,                         # Generate AI response
        tts,                         # Convert response to speech
        transport.output(),          # Play through speakers
        context_aggregator.assistant(),  # Add assistant response to context
    ])

    # Create task with tracing enabled
    task = PipelineTask(
        pipeline,
        params=PipelineParams(enable_metrics=True),
        enable_tracing=True,
        enable_turn_tracking=True,
        conversation_id=conversation_id,
    )

    # Run the agent
    runner = PipelineRunner()
    await runner.run(task)

Part 3: Add the entry point

if __name__ == "__main__":
    asyncio.run(main())

Step 4: Run your agent

Run your voice agent:
python agent.py
Speak to the agent through your microphone. All traces will automatically appear in LangSmith. View the complete agent.py code.

Advanced usage

Trace a nested LangGraph agent

Instead of a stock LLM service, you can use an in-process LangChain or LangGraph agent as the LLM stage of your pipeline. With a few adjustments, the agent’s model and tool runs nest inside Pipecat’s llm span so the whole conversation stays a single trace. Three things make this work:
  1. Set LANGSMITH_TRACING_MODE=otel. This makes the LangSmith SDK emit your LangChain/LangGraph runs as OpenTelemetry spans through the same provider, so they nest under Pipecat’s llm span instead of forming separate top-level traces.
  2. Use a traced LLM service. A bare FrameProcessor produces no llm span for the graph’s runs to nest under. Subclass a traced service such as OpenAILLMService and run your graph from the traced context handler so its runs land inside the llm span.
  3. Set llm_span_kind="chain" on the span processor. With the graph nested inside, Pipecat’s llm span no longer performs the inference itself: the graph’s own model nodes are the real LLM runs. Marking the wrapper a chain avoids an LLM run nested inside another LLM run.
The resulting trace looks like this:
conversation                        ← root: whole transcript + audio recording
└── turn × N
    ├── stt
    ├── llm                         ← chain (orchestrates the graph)
    │   ├── model                   ← ChatOpenAI (may emit tool calls)
    │   ├── tools: lookup_weather   ← tool run
    │   └── model                   ← final answer (spoken)
    └── tts
Only the final answer is spoken, but the tool-call exchange must be persisted back into Pipecat’s LLMContext (as OpenAI-format messages) or the model loses its tool history on later turns.
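As a rough illustration of what "OpenAI-format messages" means for a tool-call exchange, the sketch below builds the two messages for one round trip: the assistant message that requests the tool, and the tool message that carries its result. The helper name and the sample values are hypothetical. How you append the messages to the LLMContext depends on your Pipecat version, so that step is left out.

```python
import json
from typing import Any, Dict, List


def tool_exchange_messages(
    call_id: str, tool_name: str, arguments: Dict[str, Any], result: str
) -> List[Dict[str, Any]]:
    """Build OpenAI-format messages for one tool-call round trip.

    Hypothetical helper for illustration; the demo repository may do this differently.
    """
    return [
        {
            # The model's request to call the tool. Arguments are a JSON string.
            "role": "assistant",
            "content": None,
            "tool_calls": [
                {
                    "id": call_id,
                    "type": "function",
                    "function": {"name": tool_name, "arguments": json.dumps(arguments)},
                }
            ],
        },
        # The tool's result, linked back to the request by tool_call_id.
        {"role": "tool", "tool_call_id": call_id, "content": result},
    ]


exchange = tool_exchange_messages(
    "call_1", "lookup_weather", {"city": "Paris"}, "18 degrees and sunny"
)
```

Without both messages in the context, a later turn would see the spoken answer but not the tool call that produced it.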
For a complete working implementation, see the voice demo repository.

Custom metadata and tags

You can add custom metadata to your traces using span attributes:
from opentelemetry import trace
from pipecat.frames.frames import TextFrame
from pipecat.pipeline.task import PipelineTask

tracer = trace.get_tracer(__name__)

async def run_voice_session():
    with tracer.start_as_current_span("voice_conversation") as span:
        # Add custom metadata
        span.set_attribute("langsmith.metadata.session_type", "voice_assistant")
        span.set_attribute("langsmith.metadata.user_id", "user_123")
        # Tags are passed as a single comma-separated string
        span.set_attribute("langsmith.span.tags", "pipecat,voice-ai,stt-llm-tts")

        # Your Pipecat pipeline code here (`pipeline` is the Pipeline you built in agent.py)
        task = PipelineTask(pipeline, enable_tracing=True)
        await task.queue_frames([TextFrame("Hello")])
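If you set the same metadata in several places, a small helper can build the attribute names for you. This is a sketch under the naming shown above (langsmith.metadata.<key> for metadata, and a comma-separated langsmith.span.tags string). The function name is hypothetical.

```python
from typing import Dict, Iterable, Mapping


def langsmith_span_attributes(
    metadata: Mapping[str, object], tags: Iterable[str] = ()
) -> Dict[str, str]:
    """Build span attributes for LangSmith metadata and tags (illustrative helper)."""
    attributes = {f"langsmith.metadata.{key}": str(value) for key, value in metadata.items()}
    # Drop empty tags and join the rest into one comma-separated string.
    cleaned_tags = [tag.strip() for tag in tags if tag.strip()]
    if cleaned_tags:
        attributes["langsmith.span.tags"] = ",".join(cleaned_tags)
    return attributes


session_attributes = langsmith_span_attributes(
    {"session_type": "voice_assistant", "user_id": "user_123"},
    tags=["pipecat", "voice-ai", "stt-llm-tts"],
)
```

Inside the with block you could then call span.set_attributes(session_attributes) instead of setting each attribute individually.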

Recording and attaching audio to traces

Record the conversation and attach the audio to the root run so you can listen to it alongside the transcript. For the underlying attachment API, see Upload files with traces.
Record what was heard, not what was generated. The naive approach, adding a recording FrameProcessor to the pipeline, taps the TTS frames upstream of transport.output(), so it over-captures: it includes audio the user never heard when a barge-in truncates the agent mid-sentence. Instead, tap at the device-write boundary, which the output transport reaches only after interruption truncation.
The demo’s RecordingLocalAudioTransport does this: it records played agent audio in write_audio_frame and user audio off the input callback, then writes one stereo WAV (left channel user, right channel agent) using the shared build_stereo_session_wav helper. Use these as a reference implementation and adapt them to your project.
Swap LocalAudioTransport for the recording transport, register the recording with the span processor so it attaches when the conversation span ends, and save it in a finally block:
from pathlib import Path
from datetime import datetime
from recording_transport import ConversationRecorder, RecordingLocalAudioTransport

# Write the conversation recording to a per-run path
recordings_dir = Path.cwd() / "pipecat-recordings"
recordings_dir.mkdir(exist_ok=True)
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
recording_path = recordings_dir / f"conversation_{timestamp}.wav"

# Tap played agent audio and the user's mic at the device boundary
recorder = ConversationRecorder(recording_path)
transport = RecordingLocalAudioTransport(
    LocalAudioTransportParams(
        audio_in_enabled=True,
        audio_out_enabled=True,
        vad_analyzer=SileroVADAnalyzer(),
    ),
    recorder,
)

# Attach the WAV to the conversation root span when the span ends
span_processor.register_recording(
    conversation_id, str(recording_path), audio_recorder=recorder
)

# Build the pipeline with the recording transport
pipeline = Pipeline([
    transport.input(),               # mic in (taps user audio)
    stt,
    context_aggregator.user(),
    llm,
    tts,
    transport.output(),              # speaker out (taps played agent audio)
    context_aggregator.assistant(),
])

# Run the pipeline (create `task` from this pipeline as in agent.py),
# saving the recording on the way out
runner = PipelineRunner()
try:
    await runner.run(task)
finally:
    recorder.save_recording()
register_recording accepts any object with a save_recording() method as audio_recorder. The span processor calls it when the conversation span ends, so the file is on disk before it is read and attached. The finally block covers shutdown ordering and the case where tracing is disabled.
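To illustrate the stereo layout (left channel user, right channel agent), here is a minimal standard-library sketch that interleaves two mono 16-bit PCM buffers and writes them as one WAV file. It is not the demo's build_stereo_session_wav helper. The function names, the 16 kHz default, and the assumption that both tracks are already time-aligned and share a sample rate are all illustrative.

```python
import struct
import tempfile
import wave
from pathlib import Path


def interleave_stereo(left_pcm: bytes, right_pcm: bytes) -> bytes:
    """Interleave two mono 16-bit PCM buffers into L/R stereo frames."""
    if len(left_pcm) % 2 or len(right_pcm) % 2:
        raise ValueError("16-bit PCM buffers must have an even number of bytes")
    # Pad the shorter track with silence so both channels have the same length.
    size = max(len(left_pcm), len(right_pcm))
    left = left_pcm.ljust(size, b"\x00")
    right = right_pcm.ljust(size, b"\x00")
    stereo = bytearray(size * 2)
    # Each stereo frame is 4 bytes: two for the left sample, two for the right.
    stereo[0::4] = left[0::2]
    stereo[1::4] = left[1::2]
    stereo[2::4] = right[0::2]
    stereo[3::4] = right[1::2]
    return bytes(stereo)


def write_stereo_wav(path: Path, user_pcm: bytes, agent_pcm: bytes, sample_rate: int = 16000) -> None:
    """Write user audio (left) and agent audio (right) as one stereo WAV file."""
    with wave.open(str(path), "wb") as wav_file:
        wav_file.setnchannels(2)
        wav_file.setsampwidth(2)  # 16-bit samples
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(interleave_stereo(user_pcm, agent_pcm))


# Tiny round trip: three user samples, two agent samples.
user_audio = struct.pack("<3h", 1, 2, 3)
agent_audio = struct.pack("<2h", 10, 20)
with tempfile.TemporaryDirectory() as tmp:
    wav_path = Path(tmp) / "conversation.wav"
    write_stereo_wav(wav_path, user_audio, agent_audio)
    with wave.open(str(wav_path), "rb") as wav_file:
        channels = wav_file.getnchannels()
        samples = struct.unpack("<6h", wav_file.readframes(wav_file.getnframes()))
```

Keeping the speakers on separate channels lets you mute one side during review and makes overlapping speech, such as a barge-in, easy to spot.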

Troubleshooting

Spans not appearing in LangSmith

If traces aren’t showing up in LangSmith:
  1. Verify environment variables: Ensure OTEL_EXPORTER_OTLP_ENDPOINT and OTEL_EXPORTER_OTLP_HEADERS are set correctly in your .env file.
  2. Check API key: Confirm your LangSmith API key has write permissions.
  3. Verify import: Make sure you’re importing setup_langsmith_tracing from langsmith_processor.py and calling it before running the pipeline.
  4. Check .env loading: Ensure load_dotenv() is called before importing Pipecat components.
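A quick way to run the first check is to inspect the variables from Python before starting the pipeline. The sketch below is a hypothetical diagnostic, not part of langsmith_processor.py. It assumes the comma-separated key=value header format shown in the .env example and treats a value that still starts with "<" as an unfilled placeholder.

```python
from typing import Dict, List, Mapping


def parse_otlp_headers(raw: str) -> Dict[str, str]:
    """Parse 'key=value,key=value' into a dict with lowercased keys."""
    headers: Dict[str, str] = {}
    for pair in raw.split(","):
        if "=" not in pair:
            continue
        key, value = pair.split("=", 1)
        headers[key.strip().lower()] = value.strip()
    return headers


def find_tracing_config_problems(env: Mapping[str, str]) -> List[str]:
    """Return human-readable problems with the OTLP settings (empty list if none)."""
    problems: List[str] = []
    if not env.get("OTEL_EXPORTER_OTLP_ENDPOINT"):
        problems.append("OTEL_EXPORTER_OTLP_ENDPOINT is not set")
    headers = parse_otlp_headers(env.get("OTEL_EXPORTER_OTLP_HEADERS", ""))
    api_key = headers.get("x-api-key", "")
    if not api_key or api_key.startswith("<"):
        problems.append("OTEL_EXPORTER_OTLP_HEADERS has no usable x-api-key")
    if "langsmith-project" not in headers:
        problems.append("OTEL_EXPORTER_OTLP_HEADERS has no Langsmith-Project entry")
    return problems


example_problems = find_tracing_config_problems({
    "OTEL_EXPORTER_OTLP_ENDPOINT": "https://api.smith.langchain.com/otel",
    "OTEL_EXPORTER_OTLP_HEADERS": "x-api-key=<your-langsmith-api-key>, Langsmith-Project=pipecat-voice",
})
```

In agent.py you could call find_tracing_config_problems(os.environ) right after load_dotenv() and print anything it returns.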

Messages not showing correctly

If conversation messages aren’t displaying properly:
  1. Check span processor: Verify langsmith_processor.py is in your project directory and imported correctly.
  2. Verify conversation ID: Ensure you’re setting a unique conversation_id in PipelineTask.
  3. Enable turn tracking: Make sure enable_turn_tracking=True is set in PipelineTask.

Audio not working

If your microphone or speakers aren’t working:
  1. Check permissions: Ensure your terminal/IDE has microphone access.
  2. Test audio devices: Verify your microphone and speakers work in other applications.
  3. VAD settings: Try adjusting SileroVADAnalyzer() settings if speech isn’t being detected.
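  4. Check services: Ensure your OpenAI API key is valid and has access to the LLM and TTS models. The tutorial’s WhisperSTTService runs Whisper locally, so speech-to-text does not depend on the key.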

Import errors

If you’re getting import errors:
  1. Install dependencies: Run pip install langsmith "pipecat-ai[whisper,openai,local]" opentelemetry-exporter-otlp python-dotenv.
  2. Check Python version: Ensure you’re using Python 3.9 or higher.
  3. Verify langsmith_processor: Make sure langsmith_processor.py is downloaded and in the same directory as your agent.py.

Performance issues

If responses are slow:
  1. Use faster models: Switch to gpt-5.4-mini for the LLM (already in the tutorial).
  2. Check network: Ensure you have a stable internet connection for API calls.
  3. Local STT: The tutorial already runs Whisper locally; if you switch to an API-based STT service, factor in the extra network round trip.

Advanced: Audio recording troubleshooting

For issues with the advanced audio recording features, see the complete demo documentation.