When running evaluations on large datasets, you may encounter failures on a small subset of examples due to rate limits, network issues, or other transient errors. Rather than re-running the entire evaluation, you can identify and retry only the failed examples. This guide shows how to build that retry logic into your evaluation workflow in Python: use the error_handling='ignore' parameter to skip logging errored runs, then identify the unsuccessful examples and re-run them against the same experiment.
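The snippets in this guide reference a target function and evaluators without defining them. As a minimal placeholder sketch (the names and signatures here are illustrative assumptions, not prescribed by the LangSmith API, and evaluator signatures may differ by SDK version), they might look like:
# Placeholder target and evaluator for illustration only.
# Replace these with your own application and scoring logic.
def target(inputs: dict) -> dict:
    # Call your model or application here
    return {"answer": inputs["question"].upper()}

def correctness(outputs: dict, reference_outputs: dict) -> bool:
    # Exact-match check against the dataset's reference output
    return outputs["answer"] == reference_outputs["answer"]

# In the snippets below, evaluators=[your_evaluators] corresponds to e.g. evaluators=[correctness]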

Step 1. Run the initial evaluation

Run the initial evaluation, ignoring errors to prevent errored runs from being logged:
from langsmith import Client

client = Client()

# Run initial evaluation, ignoring errors
# error_handling='ignore' prevents errored runs from being logged
results = await client.aevaluate(
    target,
    data="dataset",
    evaluators=[your_evaluators],
    error_handling='ignore'
)

Step 2. Retry the failed examples and log to the same experiment

First, identify the examples that don't yet have a run logged to the experiment:
# Identify unsuccessful examples
runs = client.list_runs(project_name=results.experiment_name)
successful_example_ids = {r.reference_example_id for r in runs}
unsuccessful_examples = [e for e in client.list_examples(dataset_name="dataset") if e.id not in successful_example_ids]
Next, re-run all the failed examples and log them to the same experiment:
# Retry only the failed examples, logging to the same experiment
results_retry = await client.aevaluate(
    target,
    data=unsuccessful_examples,
    evaluators=[your_evaluators],
    experiment=results.experiment_name,
    error_handling='ignore'
)
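Optionally, you can repeat the check from Step 2 to confirm that every example now has a run logged to the experiment. A minimal sketch, reusing the same client calls as above:
# Re-check which examples still have no logged run after the retry
runs_after = client.list_runs(project_name=results.experiment_name)
covered_ids = {r.reference_example_id for r in runs_after}
all_example_ids = {e.id for e in client.list_examples(dataset_name="dataset")}
still_missing = all_example_ids - covered_ids
print(f"{len(still_missing)} examples still have no logged run")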
