Plan restrictions apply: the Data Export functionality is only supported on the LangSmith Plus and Enterprise tiers.
LangSmith’s bulk data export lets you export trace data from a specific project and date range to an S3-compatible bucket in Parquet format, matching the fields in the Run data format. This is useful for offline analysis in tools like BigQuery, Snowflake, Redshift, or Jupyter Notebooks. This page covers how to:
  • Create an export destination
  • Create and configure an export job, including scheduled exports and field filtering
  • Monitor export progress
Before you start: exports may take some time depending on data volume, and LangSmith limits how many exports can run concurrently. Bulk exports have a 72-hour runtime timeout—refer to Automatic retry behavior for details. Once launched, LangSmith handles orchestration and resilience of the export process automatically.

1. Create a destination

The destination tells LangSmith where to write your exported data. Before making this request, you will need:
  • Your LangSmith API key and workspace ID.
  • An S3 or S3-compatible bucket with write access granted to LangSmith (refer to Permissions required).
  • The bucket name, prefix, and either the AWS region (for AWS S3) or the endpoint URL (for GCS, MinIO, or other S3-compatible providers).
  • An access key and secret key for the bucket.
curl --request POST \
  --url 'https://api.smith.langchain.com/api/v1/bulk-exports/destinations' \
  --header 'Content-Type: application/json' \
  --header 'X-API-Key: YOUR_API_KEY' \
  --header 'X-Tenant-Id: YOUR_WORKSPACE_ID' \
  --data '{
    "destination_type": "s3",
    "display_name": "My S3 Destination",
    "config": {
      "bucket_name": "your-s3-bucket-name",
      "prefix": "root_folder_prefix",
      "region": "your aws s3 region",
      "endpoint_url": "your endpoint url for s3 compatible buckets"
    },
    "credentials": {
      "access_key_id": "YOUR_S3_ACCESS_KEY_ID",
      "secret_access_key": "YOUR_S3_SECRET_ACCESS_KEY"
    }
  }'
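The JSON body above can also be assembled programmatically. A minimal sketch in Python, assuming (per the prerequisites above) that you supply exactly one of `region` (AWS S3) or `endpoint_url` (S3-compatible providers); the helper name is ours, not part of any SDK:

```python
from typing import Optional

def build_destination_payload(
    bucket_name: str,
    prefix: str,
    access_key_id: str,
    secret_access_key: str,
    display_name: str = "My S3 Destination",
    region: Optional[str] = None,        # for AWS S3
    endpoint_url: Optional[str] = None,  # for GCS, MinIO, other S3-compatible stores
) -> dict:
    """Build the POST body for /api/v1/bulk-exports/destinations.

    Assumes exactly one of `region` or `endpoint_url` should be set,
    matching the prerequisites listed above.
    """
    if (region is None) == (endpoint_url is None):
        raise ValueError("set exactly one of region or endpoint_url")
    config = {"bucket_name": bucket_name, "prefix": prefix}
    if region is not None:
        config["region"] = region
    else:
        config["endpoint_url"] = endpoint_url
    return {
        "destination_type": "s3",
        "display_name": display_name,
        "config": config,
        "credentials": {
            "access_key_id": access_key_id,
            "secret_access_key": secret_access_key,
        },
    }
```

The resulting dict can be serialized with `json.dumps` and sent as the request body shown above.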
Credentials are stored in encrypted form. The API validates the destination and credentials before saving; if the request fails, refer to Debug destination errors. Save the id from the response; you will need it when creating an export job. Refer to Manage bulk export destinations for permissions setup, provider-specific configuration (AWS S3, GCS, MinIO), and credential options.

2. Create an export job

An export job targets a specific project and date range. You will need:
  • The destination id from the previous step.
  • The project ID (session_id)—copy this from the individual project view in the Tracing Projects list.
  • A start_time and end_time in UTC ISO 8601 format.
curl --request POST \
  --url 'https://api.smith.langchain.com/api/v1/bulk-exports' \
  --header 'Content-Type: application/json' \
  --header 'X-API-Key: YOUR_API_KEY' \
  --header 'X-Tenant-Id: YOUR_WORKSPACE_ID' \
  --data '{
    "bulk_export_destination_id": "your_destination_id",
    "session_id": "project_uuid",
    "start_time": "2024-01-01T00:00:00Z",
    "end_time": "2024-01-03T00:00:00Z",
    "format_version": "v2_beta"
  }'
The start_time is inclusive and the end_time is exclusive: the export includes every run where run.start_time >= start_time and run.start_time < end_time. Save the id from the response to monitor the export's progress. You can optionally add a filter expression to narrow the set of runs exported; refer to our filter query language and examples for syntax. If filter is not set, all runs are exported.
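The inclusive/exclusive window semantics can be expressed as a small predicate (a sketch; the helper name is ours, and timestamps are the UTC ISO 8601 strings used above):

```python
from datetime import datetime

def in_export_window(run_start_time: str, start_time: str, end_time: str) -> bool:
    """True if a run falls inside the export window:
    start_time is inclusive, end_time is exclusive,
    i.e. run.start_time >= start_time and run.start_time < end_time."""
    def parse(ts: str) -> datetime:
        # ISO 8601 with a trailing "Z" for UTC
        return datetime.fromisoformat(ts.replace("Z", "+00:00"))
    return parse(start_time) <= parse(run_start_time) < parse(end_time)
```

Note that a run starting exactly at end_time is excluded; it belongs to the next window.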

Schedule recurring exports

Requires LangSmith Helm version >= 0.10.42 (application version >= 0.10.109)
Scheduled exports collect runs periodically and export to the configured destination. To create a scheduled export, include interval_hours and omit end_time:
curl --request POST \
  --url 'https://api.smith.langchain.com/api/v1/bulk-exports' \
  --header 'Content-Type: application/json' \
  --header 'X-API-Key: YOUR_API_KEY' \
  --header 'X-Tenant-Id: YOUR_WORKSPACE_ID' \
  --data '{
    "bulk_export_destination_id": "your_destination_id",
    "session_id": "project_uuid",
    "start_time": "2024-01-01T00:00:00Z",
    "interval_hours": 1,
    "format_version": "v2_beta"
  }'
  • interval_hours must be between 1 and 168 (1 week) inclusive.
  • end_time must be omitted for scheduled exports; it is still required for one-time exports.
  • Each spawned export covers start_time to start_time + interval_hours, then advances by interval_hours for each subsequent run. Since end_time is exclusive, consecutive exports do not overlap.
  • Spawned exports run at end_time + 10 minutes to account for runs submitted with end_time in the recent past.
  • Spawned exports have the source_bulk_export_id attribute set. Cancelling the source export does not cancel already-spawned exports; cancel those separately if needed.
  • To stop a scheduled export, cancel it.
Example: if a scheduled bulk export is created with start_time=2025-07-16T00:00:00Z and interval_hours=6:

| Export | Start Time | End Time | Runs At |
| --- | --- | --- | --- |
| 1 | 2025-07-16T00:00:00Z | 2025-07-16T06:00:00Z | 2025-07-16T06:10:00Z |
| 2 | 2025-07-16T06:00:00Z | 2025-07-16T12:00:00Z | 2025-07-16T12:10:00Z |
| 3 | 2025-07-16T12:00:00Z | 2025-07-16T18:00:00Z | 2025-07-16T18:10:00Z |
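The schedule above can be reproduced with a short generator (a sketch under the rules stated above; the function name is ours):

```python
from datetime import datetime, timedelta

def export_windows(start_time: str, interval_hours: int, count: int):
    """Yield (start, end, runs_at) for the first `count` spawned exports.

    Each window spans interval_hours, the next window starts where the
    previous one ended (end_time is exclusive, so windows never overlap),
    and each export runs 10 minutes after its end_time.
    """
    if not 1 <= interval_hours <= 168:
        raise ValueError("interval_hours must be between 1 and 168 inclusive")
    start = datetime.fromisoformat(start_time.replace("Z", "+00:00"))
    step = timedelta(hours=interval_hours)
    for _ in range(count):
        end = start + step
        yield start, end, end + timedelta(minutes=10)
        start = end
```

Running it with start_time=2025-07-16T00:00:00Z and interval_hours=6 reproduces the three rows in the table.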

Limit exported fields

Requires LangSmith Helm version >= 0.12.11 (application version >= 0.12.42). Supported in both one-time and scheduled exports.
You can improve export speed and reduce file size by limiting which fields are included using the export_fields parameter. When omitted, all fields are included.
curl --request POST \
  --url 'https://api.smith.langchain.com/api/v1/bulk-exports' \
  --header 'Content-Type: application/json' \
  --header 'X-API-Key: YOUR_API_KEY' \
  --header 'X-Tenant-Id: YOUR_WORKSPACE_ID' \
  --data '{
    "bulk_export_destination_id": "your_destination_id",
    "session_id": "project_uuid",
    "start_time": "2024-01-01T00:00:00Z",
    "end_time": "2024-01-03T00:00:00Z",
    "export_fields": ["id", "name", "run_type", "start_time", "end_time", "status", "total_tokens", "total_cost"],
    "format_version": "v2_beta"
  }'
Excluding inputs and outputs can significantly improve export performance and reduce file sizes, especially for large runs. Only include these fields if you need them for your analysis.
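One way to assemble export_fields is to start from the full default field list (see the Exportable fields section) and drop the heavy JSON payloads. A sketch; only inputs and outputs are called out as heavy in the note above, so treating them as the exclusion set is the grounded choice:

```python
# Full default field list, per the "Exportable fields" section.
ALL_FIELDS = [
    "id", "tenant_id", "session_id", "trace_id", "parent_run_id",
    "parent_run_ids", "reference_example_id",
    "name", "run_type", "start_time", "end_time", "status",
    "is_root", "dotted_order", "trace_tier",
    "inputs", "outputs", "error", "extra", "events",
    "tags", "feedback_stats",
    "total_tokens", "prompt_tokens", "completion_tokens",
    "total_cost", "prompt_cost", "completion_cost", "first_token_time",
]

# Large JSON payloads that dominate file size (per the note above).
HEAVY_FIELDS = {"inputs", "outputs"}

export_fields = [f for f in ALL_FIELDS if f not in HEAVY_FIELDS]
```

The resulting list goes straight into the `export_fields` key of the request body.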

Exportable fields

By default, bulk exports include the following fields for each run.

Identifiers & hierarchy:

| Field | Description |
| --- | --- |
| id | Run ID |
| tenant_id | Workspace/tenant ID |
| session_id | Project/session ID |
| trace_id | Trace ID |
| parent_run_id | Parent run ID |
| parent_run_ids | List of all parent run IDs |
| reference_example_id | Reference to the example if the run is part of a dataset |

Basic metadata:

| Field | Description |
| --- | --- |
| name | Run name |
| run_type | Type of run (e.g., "chain", "llm", "tool") |
| start_time | Start timestamp (UTC) |
| end_time | End timestamp (UTC) |
| status | Run status (e.g., "success", "error") |
| is_root | Whether this is a root-level run |
| dotted_order | Hierarchical ordering string |
| trace_tier | Trace tier/retention level |

Run data:

| Field | Description |
| --- | --- |
| inputs | Run inputs (JSON) |
| outputs | Run outputs (JSON) |
| error | Error message if the run failed |
| extra | Extra metadata (JSON) |
| events | Run events (JSON) |

Tags & feedback:

| Field | Description |
| --- | --- |
| tags | List of tags |
| feedback_stats | Feedback statistics (JSON) |

Token usage & costs:

| Field | Description |
| --- | --- |
| total_tokens | Total token count |
| prompt_tokens | Prompt token count |
| completion_tokens | Completion token count |
| total_cost | Total cost |
| prompt_cost | Prompt cost |
| completion_cost | Completion cost |
| first_token_time | Time to first token |

Partitioning scheme

Data is exported into your bucket using the following Hive partitioned structure:
<bucket>/<prefix>/export_id=<export_id>/tenant_id=<tenant_id>/session_id=<session_id>/runs/year=<year>/month=<month>/day=<day>
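For downstream readers that don't auto-discover Hive partitions, the per-day path can be reconstructed from the template above. A sketch; the helper name is ours, and the zero-padding of month and day is an assumption, not confirmed by the template:

```python
def partition_path(prefix: str, export_id: str, tenant_id: str,
                   session_id: str, year: int, month: int, day: int) -> str:
    """Build the Hive-style partition path (relative to the bucket root)
    following the scheme above. Month/day zero-padding is assumed."""
    return (
        f"{prefix}/export_id={export_id}/tenant_id={tenant_id}/"
        f"session_id={session_id}/runs/"
        f"year={year}/month={month:02d}/day={day:02d}"
    )
```

Tools that understand Hive partitioning (BigQuery, Spark, pyarrow) can instead be pointed at `<bucket>/<prefix>` and will infer the partition columns from the path.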

3. Monitor your export

Poll the export status using the id from the previous step:
curl --request GET \
  --url 'https://api.smith.langchain.com/api/v1/bulk-exports/{export_id}' \
  --header 'Content-Type: application/json' \
  --header 'X-API-Key: YOUR_API_KEY' \
  --header 'X-Tenant-Id: YOUR_WORKSPACE_ID'
The status field in the response will be one of CREATED, RUNNING, COMPLETED, FAILED, CANCELLED, or TIMEDOUT. Exports may take some time depending on the volume of data. Once the status is COMPLETED, the Parquet files are available in your bucket. Refer to Monitor and troubleshoot bulk exports for how to list runs, stop an export, and diagnose failures.
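A polling loop over the endpoint above can be sketched as follows; the terminal statuses come from the list above, while the function names, interval, and retry cap are our own choices (the `fetch_status` callable would wrap the GET request):

```python
import time

# Statuses after which the export will not change state again.
TERMINAL_STATUSES = {"COMPLETED", "FAILED", "CANCELLED", "TIMEDOUT"}

def is_terminal(status: str) -> bool:
    """True once an export has finished (successfully or not)."""
    return status in TERMINAL_STATUSES

def poll_export(fetch_status, interval_s: float = 30.0, max_polls: int = 1000) -> str:
    """Poll `fetch_status` (a callable returning the current status string)
    until a terminal status is returned. Kept injectable so the HTTP call
    stays outside this sketch."""
    for _ in range(max_polls):
        status = fetch_status()
        if is_terminal(status):
            return status
        time.sleep(interval_s)
    raise TimeoutError("export did not reach a terminal status")
```

Once `poll_export` returns "COMPLETED", the Parquet files are in your bucket under the partitioned layout described above.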