Automate Dataset Management
Overview
This section covers how to programmatically manage goldens in datasets using the Evals API:
- Push single and multi-turn goldens to datasets
- Set finalized=True to make goldens available for evaluation, or finalized=False to queue them for review
- Include custom column values when pushing goldens
- Delete datasets programmatically
Push Goldens
Push goldens to a dataset. If the dataset does not already exist, Confident AI will create it for you.
Goldens can be pushed to single-turn datasets or to multi-turn datasets. Multi-turn goldens can be pushed with or without turns.
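The push behavior described above can be sketched with a small in-memory model. This is not the Confident AI SDK or the Evals API: ToyDataset, push_goldens, and the dictionary-based store are hypothetical names used only to illustrate two rules from the text, namely that a dataset is created on first push, and that finalized decides whether goldens become available for evaluation or are queued for review.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDataset:
    """Hypothetical in-memory stand-in for a dataset (not the real SDK)."""
    finalized: list = field(default_factory=list)  # available for evaluation
    in_review: list = field(default_factory=list)  # queued for review


def push_goldens(store: dict, alias: str, goldens: list, finalized: bool = True) -> ToyDataset:
    """Add goldens to the dataset named `alias`, creating it if needed."""
    # Assumption: a missing dataset is created silently on first push.
    dataset = store.setdefault(alias, ToyDataset())
    target = dataset.finalized if finalized else dataset.in_review
    target.extend(goldens)
    return dataset


store = {}
push_goldens(store, "My Dataset", [{"input": "What is the refund policy?"}])
push_goldens(store, "My Dataset", [{"input": "Draft question"}], finalized=False)
```

After these two calls the store holds one dataset with one finalized golden and one golden awaiting review.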
Add Custom Columns
You can include custom column values when pushing goldens. If a custom column does not already exist on the dataset, Confident AI will create it for you.
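As a rough illustration of the column behavior above, the sketch below computes which columns a dataset would have after a push. It is a toy model, not the SDK: the function name is hypothetical, goldens are plain dictionaries, and the custom_column_key_values key is an assumption about how custom values are attached to a golden.

```python
def columns_after_push(existing_columns: set, goldens: list) -> set:
    """Return the dataset's custom columns after pushing `goldens`."""
    columns = set(existing_columns)
    for golden in goldens:
        # Assumed field name; a golden may carry no custom values at all.
        values = golden.get("custom_column_key_values") or {}
        # Columns that are not on the dataset yet are created.
        columns.update(values.keys())
    return columns


goldens = [
    {"input": "Reset my password", "custom_column_key_values": {"priority": "high"}},
    {"input": "Cancel my plan", "custom_column_key_values": {"team": "billing"}},
    {"input": "Hello"},
]
result = columns_after_push({"priority"}, goldens)
```

Here result contains the existing priority column plus the newly created team column.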
Versioning Datasets
Datasets support immutable, named versions so you can pin evaluation runs to a specific snapshot of goldens.
- Create a version to snapshot the current state of the dataset.
- Push without specifying version to add goldens to the latest version (or unversioned, if the dataset has no versions yet).
- Push with version=... to add goldens to a specific version.
- Pull without version to read the latest version. Pull with version=... to read a specific version.
- Get versions to list all snapshots, newest first.
Create a version
The first call to create_version backfills every existing unversioned golden onto the new version. Subsequent calls snapshot all goldens from the previous version (with new IDs) and auto-increment the version number.
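The backfill and snapshot rules above can be made concrete with a toy model. The class below is not Confident AI's implementation. ToyVersionedDataset and its methods are hypothetical, and integer version numbers starting at 1 are an assumption, since the text only says the number auto-increments.

```python
class ToyVersionedDataset:
    """Hypothetical in-memory model of the versioning rules."""

    def __init__(self):
        self._next_id = 1
        self.goldens = []   # each golden: {"id", "input", "version"}
        self.latest = None  # None until the first version is created

    def add(self, text, version=None):
        golden = {"id": self._next_id, "input": text, "version": version}
        self._next_id += 1
        self.goldens.append(golden)
        return golden

    def create_version(self):
        if self.latest is None:
            # First call: backfill every unversioned golden onto version 1.
            for golden in self.goldens:
                golden["version"] = 1
            self.latest = 1
        else:
            # Later calls: copy the previous version's goldens under new IDs.
            previous = [g for g in self.goldens if g["version"] == self.latest]
            self.latest += 1
            for golden in previous:
                self.add(golden["input"], self.latest)
        return self.latest


dataset = ToyVersionedDataset()
dataset.add("What is the refund policy?")
dataset.add("How do I reset my password?")
first = dataset.create_version()   # backfills both goldens
second = dataset.create_version()  # snapshots them with new IDs
```

In this model the second call leaves the first version's goldens untouched and adds two copies with fresh IDs under the next version number.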
List versions
Push and pull a specific version
When version is omitted, push and pull operate on the latest version. If the dataset has no versions yet, push leaves goldens unversioned and pull returns those unversioned goldens with version: null.
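The default-version rule can be sketched as a small resolver. Both functions are hypothetical helpers over plain dictionaries, not SDK calls, and integer version numbers are an assumption. Python's None plays the role of the null version mentioned above.

```python
from typing import Optional


def resolve_version(existing_versions: list, requested: Optional[int] = None) -> Optional[int]:
    """Pick the version a push or pull operates on."""
    if requested is not None:
        return requested
    # No explicit version: use the latest, or None if there are no versions yet.
    return max(existing_versions) if existing_versions else None


def pull(goldens: list, existing_versions: list, version: Optional[int] = None) -> list:
    """Return the goldens belonging to the resolved version."""
    target = resolve_version(existing_versions, version)
    return [g for g in goldens if g["version"] == target]


unversioned = [{"input": "a", "version": None}]
versioned = [{"input": "a", "version": 1}, {"input": "a", "version": 2}]

no_versions_yet = pull(unversioned, [])
latest = pull(versioned, [1, 2])
pinned = pull(versioned, [1, 2], version=1)
```

With no versions, pull returns the unversioned goldens. Otherwise it returns the latest version unless a specific one is requested.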
Delete Dataset
Delete a dataset programmatically via the Evals API.
This action cannot be undone. All goldens or conversational goldens in the dataset will be permanently deleted.
Switching Projects
You can push or manage datasets in any project by configuring a CONFIDENT_API_KEY.
- For default usage, set CONFIDENT_API_KEY as an environment variable.
- To target a specific project, pass a confident_api_key directly when creating the EvaluationDataset.
When both are provided, the confident_api_key passed to EvaluationDataset always takes precedence over the environment variable.
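The precedence rule can be expressed as a short resolver. This is an illustrative sketch, not the SDK's code: resolve_api_key is a hypothetical helper, and the env parameter exists only so the example can be run without touching the real environment.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(
    confident_api_key: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Return the key to use: the explicit argument first, then the environment."""
    env = os.environ if env is None else env
    if confident_api_key is not None:
        # An explicitly passed key always wins over the environment variable.
        return confident_api_key
    return env.get("CONFIDENT_API_KEY")


fake_env = {"CONFIDENT_API_KEY": "key-for-default-project"}
default_key = resolve_api_key(env=fake_env)
override_key = resolve_api_key("key-for-other-project", env=fake_env)
```

In this sketch default_key comes from the environment, while override_key is the explicitly passed value.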
Next Steps
Now that you know how to push goldens, learn how to pull them for evaluation.