When running evaluations on large datasets, you may encounter failures on a small subset of examples due to rate limits, network issues, or other transient errors. Rather than re-running the entire evaluation, you can identify and retry only the failed examples in an experiment. This guide shows how to build retry logic into your evaluation workflow: use the error_handling='ignore' parameter to skip logging errored runs, then automatically identify the unsuccessful examples and re-run them in Python.
Step 1. Run the initial evaluation
Run the initial evaluation, ignoring errors to prevent errored runs from being logged:
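A minimal sketch of this step, assuming the LangSmith Python SDK, a dataset named "my-dataset", and placeholder target and evaluator functions; passing error_handling="ignore" directly to evaluate() is an assumption about where the parameter is accepted:

```python
from langsmith import Client

client = Client()

def target(inputs: dict) -> dict:
    # Replace with your application; transient failures raised here
    # (rate limits, network errors) will not be logged as runs.
    return {"answer": inputs["question"].upper()}

def correctness(outputs: dict, reference_outputs: dict) -> bool:
    # Replace with your own evaluator logic.
    return outputs["answer"] == reference_outputs["answer"]

results = client.evaluate(
    target,
    data="my-dataset",                  # assumed dataset name
    evaluators=[correctness],
    experiment_prefix="retryable-eval",
    error_handling="ignore",            # skip logging errored runs
)
experiment_name = results.experiment_name  # used later to find failed examples
```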
Step 2. Retry failed examples and log to the same experiment

Fetch all the unsuccessful examples and re-run them, logging the results to the same experiment:
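Because errored runs were not logged, any dataset example without a run in the experiment is one that failed. A hedged sketch of that comparison, reusing the client, target, correctness, and experiment_name names from the previous step; passing experiment= to append the retried runs to the existing experiment is an assumption, so see the guide on evaluating an existing experiment linked below:

```python
# Runs logged to the experiment correspond to examples that succeeded.
completed_example_ids = {
    run.reference_example_id
    for run in client.list_runs(project_name=experiment_name, is_root=True)
    if run.reference_example_id is not None
}

# Any example without a logged run errored on the first pass.
failed_examples = [
    example
    for example in client.list_examples(dataset_name="my-dataset")
    if example.id not in completed_example_ids
]

if failed_examples:
    client.evaluate(
        target,
        data=failed_examples,        # retry only the failed examples
        evaluators=[correctness],
        experiment=experiment_name,  # assumption: extends the existing experiment
        error_handling="ignore",     # skip logging any runs that error again
    )
```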
Related topics

- Run an evaluation
- Run an evaluation asynchronously
- Handle model rate limits
- Experiment configuration
- Evaluate existing experiment

