
From an existing run
First, ensure you have properly traced a multi-turn conversation, and then navigate to your tracing project. Once you get to your tracing project simply open the run, select the LLM call, and open it in the playground as follows:
From a dataset
Before starting, make sure you have set up your dataset. Since you want to evaluate multi-turn conversations, make sure there is a key in your inputs that contains a list of messages. Once you have created your dataset, head to the playground and load your dataset to evaluate. Then, add a messages list variable to your prompt, making sure to name it the same as the key in your inputs that contains the list of messages:
Manually
There are two ways to manually create multi-turn conversations. The first way is by simply appending messages to the prompt:

Messages List
variable, allowing you to reuse this prompt across various runs.
Next Steps
Now that you know how to set up the playground for multi-turn interactions, you can either manually inspect and judge the outputs, or you can add evaluators to classify results. You can also read these how-to guides to learn more about how to use the playground to run evaluations.Connect these docs programmatically to Claude, VSCode, and more via MCP for real-time answers.