
Automate Dataset Management

Programmatically push goldens to datasets via the Evals API.

Overview

This section covers how to programmatically manage goldens in datasets using the Evals API:

  • Push single and multi-turn goldens to datasets
  • Set finalized=True to make goldens available for evaluation, or finalized=False to queue for review
  • Include custom column values when pushing goldens
  • Delete datasets programmatically
Only finalized goldens will be pulled for evaluation.

Push Goldens

Push goldens to a dataset. If the dataset does not already exist, Confident AI will create it for you.


For single-turn datasets:

main.py
from deepeval.dataset import EvaluationDataset, Golden

goldens = [Golden(input="How tall is Mt. Everest?")]
dataset = EvaluationDataset(goldens=goldens)

# Push as finalized (ready for evaluation)
dataset.push(alias="YOUR-DATASET-ALIAS", finalized=True)

# Or push as unfinalized (queued for review)
dataset.push(alias="YOUR-DATASET-ALIAS", finalized=False)

For multi-turn datasets, shown here with turns included (they are optional):

main.py
from deepeval.dataset import EvaluationDataset, ConversationalGolden
from deepeval.test_case import Turn

goldens = [
    ConversationalGolden(
        scenario="Angry user asking for a refund.",
        turns=[Turn(role="user", content="Give me my money!")]
    )
]
dataset = EvaluationDataset(goldens=goldens)

dataset.push(alias="YOUR-DATASET-ALIAS", finalized=True)

Add Custom Columns

You can include custom column values when pushing goldens. If a custom column does not already exist on the dataset, Confident AI will create it for you.

main.py
from deepeval.dataset import Golden, ConversationalGolden

golden = Golden(
    input="How tall is Mt. Everest?",
    custom_column_key_values={"difficulty": "easy", "category": "geography"}
)

multiturn_golden = ConversationalGolden(
    scenario="User asking for a refund.",
    custom_column_key_values={"sentiment": "angry", "priority": "high"}
)
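
The example above only constructs the goldens; pushing them works the same way as in Push Goldens. As a rough sketch, assuming your source data is a list of flat dictionaries, you could separate the input from the custom column values before building each golden. The row layout and the helper names split_row and push_rows are hypothetical and not part of deepeval, and converting values to strings is an assumption about what custom columns hold.

```python
from typing import Any, Dict, List, Tuple


def split_row(row: Dict[str, Any], input_key: str = "input") -> Tuple[str, Dict[str, str]]:
    """Split a flat row into the golden input and its custom column values."""
    if input_key not in row:
        raise KeyError(f"row is missing required key {input_key!r}")
    # Every other key is treated as a custom column. Values are stringified
    # on the assumption (ours, not the docs') that custom columns hold text.
    custom = {k: str(v) for k, v in row.items() if k != input_key}
    return str(row[input_key]), custom


def push_rows(rows: List[Dict[str, Any]], alias: str, finalized: bool = False) -> None:
    """Build goldens from rows and push them (needs deepeval and an API key)."""
    # Imported lazily so split_row can be used without deepeval installed.
    from deepeval.dataset import EvaluationDataset, Golden

    goldens = []
    for row in rows:
        text, custom = split_row(row)
        goldens.append(Golden(input=text, custom_column_key_values=custom))
    EvaluationDataset(goldens=goldens).push(alias=alias, finalized=finalized)


rows = [{"input": "How tall is Mt. Everest?", "difficulty": "easy", "category": "geography"}]
print(split_row(rows[0]))
```

Calling push_rows(rows, alias="YOUR-DATASET-ALIAS") would then push the goldens as unfinalized, which leaves them queued for review rather than immediately available for evaluation.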

Delete Dataset

Delete a dataset programmatically via the Evals API.

This action cannot be undone. All goldens or conversational goldens in the dataset will be permanently deleted.

main.py
from deepeval.dataset import EvaluationDataset

dataset = EvaluationDataset()
dataset.delete(alias="YOUR-DATASET-ALIAS")
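
Because deletion is permanent, it may be worth guarding the call in scripts. The following is a minimal sketch of one possible safeguard: the caller must repeat the alias exactly before anything is deleted. The helpers confirm_alias and delete_dataset are hypothetical wrappers, not part of deepeval; only EvaluationDataset().delete(alias=...) comes from the example above.

```python
def confirm_alias(alias: str, typed: str) -> bool:
    """Return True only if the typed confirmation matches the alias exactly."""
    return bool(alias) and typed == alias


def delete_dataset(alias: str, typed_confirmation: str) -> bool:
    """Delete the dataset only after an exact-match confirmation."""
    if not confirm_alias(alias, typed_confirmation):
        return False  # confirmation failed, nothing is deleted
    # Imported lazily; needs deepeval and a configured API key.
    from deepeval.dataset import EvaluationDataset

    EvaluationDataset().delete(alias=alias)
    return True


# A mismatched confirmation returns early without contacting Confident AI.
print(delete_dataset("YOUR-DATASET-ALIAS", "wrong-alias"))
```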

Switching Projects

You can push or manage datasets in any project by configuring a CONFIDENT_API_KEY.

  • For default usage, set CONFIDENT_API_KEY as an environment variable.
  • To target a specific project, pass a confident_api_key directly when creating the EvaluationDataset.
main.py
from deepeval.dataset import EvaluationDataset

dataset = EvaluationDataset(confident_api_key="confident_us...")

When both are provided, the confident_api_key passed to EvaluationDataset always takes precedence over the environment variable.
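
The precedence rule can be illustrated with a small standalone function. This is only a sketch of the rule as described here, not deepeval's actual implementation; resolve_api_key and the sample key values are hypothetical.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(
    explicit_key: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Pick the key to use: explicit argument first, then CONFIDENT_API_KEY."""
    env = os.environ if env is None else env
    if explicit_key:
        return explicit_key  # a directly passed key always wins
    return env.get("CONFIDENT_API_KEY")  # fall back to the environment


fake_env = {"CONFIDENT_API_KEY": "env-key"}
print(resolve_api_key("project-key", fake_env))  # explicit key is chosen
print(resolve_api_key(None, fake_env))           # environment key is chosen
```

Passing the environment as a parameter keeps the function easy to test without modifying real environment variables.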

Next Steps

Now that you know how to push goldens, learn how to pull them for evaluation.

Pull Datasets

Pull datasets locally to use them in code-driven evaluations.