Use the LangSmith SDK to manage feedback configurations and annotation queue rubrics programmatically. Define reusable feedback schemas at the organization level (like accuracy scores or pass/fail judgments), then assign them to specific queues with custom instructions. This enables version control, automation across projects, and consistency—particularly useful for CI/CD pipelines or replicating evaluation setups across environments.
This guide uses the Python and TypeScript SDKs. For installation and setup, refer to the Python SDK documentation and TypeScript SDK documentation.

Feedback layers

LangSmith uses a three-layer architecture for structured human feedback:
  1. Feedback configs: Organization-wide definitions of feedback keys that establish the schema for evaluation metrics. For example, you might define “accuracy” as a continuous 0–1 score or “correctness” as a pass/fail categorical choice. These configs are reusable across all annotation queues in your organization.
  2. Annotation queue rubric items: Queue-specific assignments that determine which feedback configs annotators must fill out when reviewing runs in a particular queue. Each rubric item can include custom descriptions, guidance for specific score values, and whether the feedback is required or optional.
  3. Feedback: Individual scores and values that annotators submit on specific runs. This is the actual evaluation data collected using the schemas you’ve defined. Learn more about feedback in LangSmith.
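The relationship between the three layers can be sketched as plain data. This is an illustrative sketch using the dict shapes that appear in the SDK calls later in this guide; the field values here are made up:

```python
# Layer 1: an org-wide feedback config defines the schema for a key.
feedback_config = {
    "feedback_key": "accuracy",
    "feedback_config": {"type": "continuous", "min": 0, "max": 1},
}

# Layer 2: a rubric item assigns that config to one queue, by key.
rubric_item = {
    "feedback_key": "accuracy",  # must match an existing config's key
    "description": "How accurate is the response?",
    "is_required": True,
}

# Layer 3: a feedback record is one annotator's score on one run,
# constrained by the config's schema.
feedback = {"run_id": "some-run-id", "key": "accuracy", "score": 0.8}

# The feedback key is the thread that ties the three layers together.
assert feedback_config["feedback_key"] == rubric_item["feedback_key"] == feedback["key"]
```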

Feedback configs

Create a feedback config

Feedback configs define the schema for a feedback key — whether it’s a continuous score, a categorical choice, or freeform text. A unique key identifies each config within your organization and specifies how annotators can submit feedback for that metric.
Calling create_feedback_config with an identical config that already exists returns the existing config. If a different config already exists for the same key, the system raises a 400 error.
from langsmith import Client

client = Client()

# Continuous score
client.create_feedback_config(
    "accuracy",
    feedback_config={
        "type": "continuous",
        "min": 0,
        "max": 1,
    },
    is_lower_score_better=False,
)

# Categorical
client.create_feedback_config(
    "correctness",
    feedback_config={
        "type": "categorical",
        "categories": [
            {"value": 1, "label": "Pass"},
            {"value": 0, "label": "Fail"},
        ],
    },
)

# Freeform text
client.create_feedback_config(
    "notes",
    feedback_config={"type": "freeform"},
)
  • Continuous ("accuracy"): Defines a numeric scale from 0 to 1. The is_lower_score_better parameter indicates whether lower values represent better performance. Use continuous configs for rating scales or percentage-based metrics.
  • Categorical ("correctness"): Provides predefined options with associated values. Each category requires a value (used for scoring and analytics) and a label (shown to annotators). Use categorical configs for binary choices or multi-class classifications.
  • Freeform ("notes"): Allows open-ended text input with no predefined structure. Use freeform configs for qualitative observations or explanations.

List feedback configs

Use list_feedback_configs to see which evaluation criteria are available in your organization. You can list all configs or filter by specific keys. Each returned config object includes the key, type, configuration details (such as min/max or categories), and metadata such as is_lower_score_better:
# List all configs
for config in client.list_feedback_configs():
    print(f"{config.feedback_key}: {config.feedback_config}")

# Filter by specific keys
for config in client.list_feedback_configs(
    feedback_key=["accuracy", "correctness"]
):
    print(config.feedback_key)

Update a feedback config

Modify an existing feedback config with update_feedback_config. This is a partial update: only the fields you provide change, and all other configuration settings are preserved:
client.update_feedback_config(
    "accuracy",
    is_lower_score_better=True,
)

Delete a feedback config

Remove a feedback config from your organization with delete_feedback_config. This performs a soft delete, which marks the config as deleted but doesn’t permanently remove it from the system. You can recreate a config with the same key later if needed:
client.delete_feedback_config("accuracy")

Annotation queue rubric items

Rubric items assign feedback configs to a specific annotation queue. They control which feedback forms annotators see when reviewing runs in that queue, and whether each form is required or optional.

Create a queue with rubric items

Create an annotation queue with create_annotation_queue and assign feedback configs to it through rubric items. Each rubric item references a feedback config by its key and customizes how it appears to annotators in this specific queue. The example creates a queue with three rubric items. The queue-level rubric_instructions provides general guidance shown at the top of the annotation interface:
queue = client.create_annotation_queue(
    name="QA Review Queue",
    description="Review LLM outputs for accuracy and correctness",
    rubric_instructions="Score each response. Add notes for anything unusual.",
    rubric_items=[
        {
            "feedback_key": "accuracy",
            "description": "How accurate is the response?",
            "score_descriptions": {
                "0": "Completely wrong",
                "1": "Perfectly accurate",
            },
            "is_required": True,
        },
        {
            "feedback_key": "correctness",
            "description": "Did the response pass or fail?",
            "value_descriptions": {
                "Pass": "Factually correct",
                "Fail": "Contains errors",
            },
            "is_required": True,
        },
        {
            "feedback_key": "notes",
            "description": "Any additional observations",
            "is_required": False,
        },
    ],
)
  • feedback_key: The key of an existing feedback config (create this first).
  • description: Queue-specific guidance for annotators about this metric.
  • score_descriptions / value_descriptions: Optional labels that explain what specific values mean (use score_descriptions for continuous configs, value_descriptions for categorical).
  • is_required: Whether annotators must complete this feedback before submitting.

Update rubric items on an existing queue

Modify the rubric items assigned to an annotation queue with update_annotation_queue. This operation replaces the entire rubric items list, so include every item you want to keep; anything you omit is removed. You'll need the queue ID, which you get when you create the queue or by listing queues:
client.update_annotation_queue(
    queue.id,
    rubric_items=[
        {"feedback_key": "accuracy", "is_required": True},
        {"feedback_key": "correctness", "is_required": True},
        {
            "feedback_key": "tone",
            "description": "Is the tone appropriate?",
            "is_required": False,
        },
    ],
)
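Because the update replaces the whole list, a common pattern is to merge the existing items with your additions before calling update_annotation_queue. A minimal pure-Python sketch of such a merge; the helper name is ours, not part of the SDK:

```python
def merge_rubric_items(existing: list[dict], additions: list[dict]) -> list[dict]:
    """Merge rubric item lists, keyed by feedback_key.

    Items in `additions` override existing items with the same key;
    everything else is kept, so nothing is dropped by accident.
    """
    merged = {item["feedback_key"]: item for item in existing}
    for item in additions:
        merged[item["feedback_key"]] = item
    return list(merged.values())


current = [
    {"feedback_key": "accuracy", "is_required": True},
    {"feedback_key": "notes", "is_required": False},
]
new = [
    {"feedback_key": "tone", "description": "Is the tone appropriate?", "is_required": False},
]

items = merge_rubric_items(current, new)
# Pass the merged list to client.update_annotation_queue(queue_id, rubric_items=items)
```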

Feedback config types (detailed)

Continuous

Continuous configs define numeric rating scales with minimum and maximum values. Annotators can select any value within the range, making this ideal for scoring dimensions like accuracy, quality, or relevance on a numeric scale:
# Simple continuous score
client.create_feedback_config(
    "accuracy",
    feedback_config={
        "type": "continuous",
        "min": 0,
        "max": 1,
    },
)

# Continuous with labeled points on the scale
client.create_feedback_config(
    "quality",
    feedback_config={
        "type": "continuous",
        "min": 1,
        "max": 5,
        "categories": [
            {"value": 1, "label": "Poor"},
            {"value": 3, "label": "Average"},
            {"value": 5, "label": "Excellent"},
        ],
    },
)
The first example shows a 0–1 scale without labels. The second example demonstrates adding categories with labeled anchor points on the scale (like “Poor”, “Average”, “Excellent”) to help annotators understand what different values represent. These labels are optional but can improve consistency in how annotators interpret the scale.

Categorical

Categorical configs provide a discrete set of predefined options for annotators to choose from. Each category must have a value (a numeric identifier used for scoring and analytics) and a label (the text shown to annotators). You must define at least 2 categories. Use categorical configs for binary decisions (pass/fail, correct/incorrect), multi-class classifications (sentiment, topic categories), or any evaluation with a fixed set of discrete options. Do not set min or max for categorical configs:
# Binary pass/fail
client.create_feedback_config(
    "correctness",
    feedback_config={
        "type": "categorical",
        "categories": [
            {"value": 1, "label": "Pass"},
            {"value": 0, "label": "Fail"},
        ],
    },
)

# Multi-class
client.create_feedback_config(
    "sentiment",
    feedback_config={
        "type": "categorical",
        "categories": [
            {"value": 0, "label": "Negative"},
            {"value": 1, "label": "Neutral"},
            {"value": 2, "label": "Positive"},
        ],
    },
)
The first example shows a binary pass/fail config. The second example demonstrates a multi-class config for sentiment with three options. The numeric values allow you to compute aggregate scores even for categorical feedback.
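Because each category carries a numeric value, aggregate metrics reduce to ordinary arithmetic over the submitted values. A small illustrative sketch; the feedback values below are made up:

```python
# Hypothetical pass/fail values submitted by annotators
# (1 = Pass, 0 = Fail, matching the "correctness" config above).
submitted = [1, 1, 0, 1, 0, 1]

# With 0/1 values, the mean is the pass rate: 4 passes out of 6.
pass_rate = sum(submitted) / len(submitted)
print(f"pass rate: {pass_rate:.2f}")
```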

Freeform

Freeform configs allow annotators to provide open-ended text feedback without any predefined structure or constraints. This type has no min, max, or categories fields—annotators can enter any text they want. Freeform feedback is valuable for capturing nuanced insights but is harder to aggregate and analyze compared to structured feedback types:
client.create_feedback_config(
    "notes",
    feedback_config={"type": "freeform"},
)

Validation rules

| Type | min/max | categories | Constraints |
| --- | --- | --- | --- |
| continuous | Optional | Optional (labeled scale points) | min < max; category values within [min, max] |
| categorical | Must not be set | Required, min 2 | Unique values and labels |
| freeform | Must not be set | Must not be set | N/A |
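These rules can be expressed as a small client-side check before calling create_feedback_config. This validator is our own sketch of the rules above, not part of the SDK:

```python
def validate_feedback_config(cfg: dict) -> list[str]:
    """Return a list of rule violations; an empty list means the config is valid."""
    errors = []
    ctype = cfg.get("type")
    if ctype == "continuous":
        lo, hi = cfg.get("min"), cfg.get("max")
        if lo is not None and hi is not None and not lo < hi:
            errors.append("min must be less than max")
        # Optional labeled scale points must fall within [min, max].
        for cat in cfg.get("categories", []):
            if lo is not None and hi is not None and not lo <= cat["value"] <= hi:
                errors.append(f"category value {cat['value']} outside [min, max]")
    elif ctype == "categorical":
        if "min" in cfg or "max" in cfg:
            errors.append("min/max must not be set for categorical configs")
        cats = cfg.get("categories", [])
        if len(cats) < 2:
            errors.append("at least 2 categories required")
        values = [c["value"] for c in cats]
        labels = [c["label"] for c in cats]
        if len(set(values)) != len(values) or len(set(labels)) != len(labels):
            errors.append("category values and labels must be unique")
    elif ctype == "freeform":
        for field in ("min", "max", "categories"):
            if field in cfg:
                errors.append(f"{field} must not be set for freeform configs")
    else:
        errors.append(f"unknown type: {ctype!r}")
    return errors
```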

Reference

Feedback config types

| Type | Fields | Description |
| --- | --- | --- |
| continuous | min, max | Numeric score within a range |
| categorical | categories (list of {value, label}) | Selection from predefined options |
| freeform | None | Free-text input |

Rubric item fields

| Field | Type | Description |
| --- | --- | --- |
| feedback_key | string | Required. Must match an existing feedback config key. |
| description | string | Shows annotators guidance for this item. |
| score_descriptions | Record<string, string> | Labels for specific score values (continuous). |
| value_descriptions | Record<string, string> | Labels for specific category values (categorical). |
| is_required | boolean | Whether annotators must complete this item before submitting. Defaults to false. |
