Polly is in beta. Your feedback on Polly is invaluable as the team refines its capabilities.
Polly appears in the bottom-right corner of the following locations within the LangSmith UI:
Observability & Debugging:
- Trace pages - Analyze individual runs and execution traces
- Thread views - Understand conversation threads and user interactions
- Prompt Playground - Edit and optimize prompts
- Prompt Hub pages - Explore and understand shared prompts
- Dataset Experiments - Analyze experiment results and compare runs
- Dataset Examples - Browse and understand dataset structure
- Annotation Queues - Review runs and make informed annotation decisions
Trace pages
On an individual trace, Polly analyzes the run data and execution trajectory. Polly examines the full trace context, including run metadata, inputs, outputs, intermediate steps, and configuration, to help you understand what happened and identify areas for improvement.
Example questions:
- “Is there anything that the agent could have done better here?”
- “Why did this run fail?”
- “What took the most time in this trace?”
- “Summarize what happened in this trace”
Thread views
Under the Threads tab, Polly analyzes conversation threads to help you understand user sentiment, conversation outcomes, and interaction patterns. Use Polly to identify user pain points and understand whether issues were resolved.
Example questions:
- “Did the user seem frustrated?”
- “What issues is the user experiencing?”
- “Was the user’s problem solved?”
- “What was the main topic of this thread?”
Prompt Playground
In the Playground, Polly helps you edit and optimize your prompts. Use automated options like Optimize prompt, Generate a tool, or Generate an output schema, or give Polly custom instructions for editing your prompt.
Example questions:
- “Make it respond in Italian”
- “Add more context about the user’s role”
- “Make the tone more professional”
- “Simplify the instructions”

Prompt Hub pages
When viewing a prompt in the LangSmith Hub, Polly helps you understand the prompt’s structure, messages, tools, and configuration. This is useful for exploring and learning from shared prompts.
Example questions:
- “What does this prompt do?”
- “What tools does this prompt use?”
- “Explain the structure of this prompt”
- “What are the key instructions in this prompt?”
Dataset Experiments
On the Datasets page under the Experiments tab, Polly analyzes experiment results and helps you compare runs across different experiments. Polly can identify patterns, summarize performance, and help you understand which approaches work best.
Example questions:
- “Which experiment performed best?”
- “What are the main differences between these runs?”
- “Summarize the results of this experiment”
- “What patterns do you see in the failures?”
Dataset Examples
On the Datasets page under the Examples tab, Polly helps you understand your dataset structure, browse examples, and identify data patterns. This is useful for understanding what data you’re working with and preparing datasets for experiments.
Example questions:
- “What type of data is in this dataset?”
- “Show me examples with errors”
- “What patterns do you see in the inputs?”
- “How many examples are in this dataset?”
Annotation Queues
In Annotation Queues, Polly helps you analyze runs before making annotation decisions. Whether you’re reviewing runs individually or comparing them pairwise, Polly provides insights into run behavior, errors, and execution patterns to inform your scoring.
Example questions:
- “What went wrong in this run?”
- “Summarize what happened in this run”
- “Compare these two runs”
- “What should I consider when scoring this?”
What’s next
Learn more about the features that Polly helps you explore:
- Observability - Learn more about tracing and monitoring your LLM applications
- Threads - Understand how threads work in LangSmith
- Prompt Engineering - Create and iterate on prompts in the playground
- Evaluation - Evaluate and test your applications systematically
