Push, Queue, and Annotate Goldens
Overview
A dataset, whether single-turn or multi-turn, is a list of goldens and forms the basis of any evaluation workflow in development. In this section, you’ll learn how to manage goldens in datasets, including:
- Uploading goldens via CSV on the platform
- Building an automated golden ingestion pipeline via the Evals API
- Assigning different team members to review and finalize goldens
If you haven’t already, familiarize yourself with what goldens are.
If you don’t have a dataset yet, create one under Project > Datasets.
Upload Goldens via CSV
You can upload both single-turn and multi-turn goldens stored in CSV files to datasets. The fields you map to CSV headers differ slightly between the two.
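As an illustration, the sketch below writes a small CSV of single-turn goldens with Python's standard `csv` module. The header names (`input`, `expected_output`), the sample rows, and the file name `goldens.csv` are assumptions chosen for the example; use whichever headers you intend to map to golden fields during upload.

```python
import csv
import io

# Hypothetical single-turn goldens. The keys are illustrative header names,
# not names required by the platform.
GOLDENS = [
    {"input": "What is your refund policy?", "expected_output": "Refunds are accepted within 30 days."},
    {"input": "How do I reset my password?", "expected_output": "Use the 'Forgot password' link."},
]


def goldens_to_csv(goldens: list[dict[str, str]]) -> str:
    """Serialize goldens to CSV text with a single header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["input", "expected_output"])
    writer.writeheader()
    writer.writerows(goldens)
    return buffer.getvalue()


if __name__ == "__main__":
    # Write the file that you would then upload on the platform.
    with open("goldens.csv", "w", newline="", encoding="utf-8") as f:
        f.write(goldens_to_csv(GOLDENS))
```

A multi-turn CSV would follow the same pattern with a different set of columns.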
Manage Goldens via Evals API
To upload goldens programmatically instead, use Confident AI’s Evals API. You can either push goldens, which adds them in the “finalized” state, or queue them, which marks them “unfinalized”.
Push goldens
If the dataset does not already exist, Confident AI will create it for you.
For single-turn datasets, push single-turn goldens:
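A minimal sketch of a single-turn push, assuming the `deepeval` Python client is installed and already authenticated with your Confident AI API key, and that it exposes `EvaluationDataset`, `Golden`, and `EvaluationDataset.push(alias=...)` as used here. The helper names `golden_kwargs` and `push_single_turn_goldens` are hypothetical and exist only for this example.

```python
from typing import Any


def golden_kwargs(rows: list[dict[str, Any]]) -> list[dict[str, str]]:
    """Turn raw records into keyword arguments for single-turn goldens.

    Rows without a non-empty "input" are skipped.
    """
    cleaned = []
    for row in rows:
        text = str(row.get("input") or "").strip()
        if not text:
            continue
        kwargs = {"input": text}
        expected = str(row.get("expected_output") or "").strip()
        if expected:
            kwargs["expected_output"] = expected
        cleaned.append(kwargs)
    return cleaned


def push_single_turn_goldens(alias: str, rows: list[dict[str, Any]]) -> int:
    """Push cleaned rows as finalized goldens and return how many were pushed."""
    # Assumption: the deepeval client provides these names with these signatures.
    # Imported lazily so golden_kwargs() can be used without deepeval installed.
    from deepeval.dataset import EvaluationDataset, Golden

    goldens = [Golden(**kwargs) for kwargs in golden_kwargs(rows)]
    EvaluationDataset(goldens=goldens).push(alias=alias)
    return len(goldens)
```

Calling `push_single_turn_goldens("My Dataset", rows)` would then push every valid row to the dataset with that alias.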
For multi-turn datasets, push multi-turn goldens, either with or without turns:
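The multi-turn case can be sketched the same way. This example assumes the `deepeval` client provides `ConversationalGolden` (with a `scenario` field and optional `turns`) and `Turn` (with `role` and `content`), and that roles are `"user"` or `"assistant"`. Treat those names as assumptions to check against the client version you use. `conversation_spec` and `push_multi_turn_goldens` are hypothetical helpers.

```python
from typing import Optional

# Assumed set of valid turn roles.
ALLOWED_ROLES = {"user", "assistant"}


def conversation_spec(
    scenario: str,
    turns: Optional[list[tuple[str, str]]] = None,
) -> dict:
    """Describe one multi-turn golden as plain data.

    `turns` is an optional list of (role, content) pairs; omit it for a
    golden without turns.
    """
    spec = {"scenario": scenario, "turns": []}
    for role, content in turns or []:
        if role not in ALLOWED_ROLES:
            raise ValueError(f"unknown role: {role!r}")
        spec["turns"].append({"role": role, "content": content})
    return spec


def push_multi_turn_goldens(alias: str, specs: list[dict]) -> None:
    """Push multi-turn goldens built from conversation_spec() results."""
    # Assumption: deepeval provides these classes with these fields.
    from deepeval.dataset import ConversationalGolden, EvaluationDataset
    from deepeval.test_case import Turn

    goldens = []
    for spec in specs:
        turns = [Turn(**turn) for turn in spec["turns"]]
        # Pass turns only when there are any, which covers the "without turns" case.
        extra = {"turns": turns} if turns else {}
        goldens.append(ConversationalGolden(scenario=spec["scenario"], **extra))
    EvaluationDataset(goldens=goldens).push(alias=alias)
```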
Queue goldens
If the dataset does not already exist, Confident AI will create it for you.
For single-turn datasets, queue single-turn goldens:
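A minimal sketch of queueing, assuming the `deepeval` client exposes `EvaluationDataset.queue(alias=..., goldens=...)`; verify that signature against your installed version. The helpers `unique_inputs` and `queue_single_turn_goldens` are hypothetical. They drop blank and duplicate inputs before queueing so reviewers do not see the same golden twice.

```python
def unique_inputs(inputs: list[str]) -> list[str]:
    """Drop blanks and duplicates while keeping first-seen order."""
    seen = set()
    result = []
    for raw in inputs:
        text = raw.strip()
        if text and text not in seen:
            seen.add(text)
            result.append(text)
    return result


def queue_single_turn_goldens(alias: str, inputs: list[str]) -> int:
    """Queue inputs as unfinalized goldens and return how many were queued."""
    # Assumption: the deepeval client provides queue() with these keyword arguments.
    from deepeval.dataset import EvaluationDataset, Golden

    goldens = [Golden(input=text) for text in unique_inputs(inputs)]
    EvaluationDataset().queue(alias=alias, goldens=goldens)
    return len(goldens)
```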
For multi-turn datasets, queue multi-turn goldens, either with or without turns:
Custom Dataset Columns
You can add custom columns to a dataset to hold additional annotation data for each golden, as long as they don’t clash with any of the existing default field names (e.g. “Input”, “Actual Output”, etc.).
You can also set custom column values through the Evals API when pushing or queueing goldens by including the custom column key values field in single-turn or multi-turn goldens.
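The sketch below shows one way to guard against name clashes before attaching custom columns. It assumes that `deepeval`'s `Golden` accepts a `custom_column_key_values` dictionary. The reserved-name set is deliberately partial, containing only the two default fields named above, and the helper names are hypothetical.

```python
# Illustrative, partial set of default field names (normalized to lowercase
# with spaces). Extend it to match the default fields of your dataset.
RESERVED_FIELDS = {"input", "actual output"}


def _normalize(name: str) -> str:
    """Compare names case-insensitively, treating underscores as spaces."""
    return name.strip().lower().replace("_", " ")


def check_custom_columns(columns: dict[str, str]) -> dict[str, str]:
    """Return the columns unchanged, or raise if any clashes with a default field."""
    clashes = [name for name in columns if _normalize(name) in RESERVED_FIELDS]
    if clashes:
        raise ValueError(f"custom columns clash with default fields: {clashes}")
    return columns


def golden_with_custom_columns(input_text: str, columns: dict[str, str]):
    """Build a single-turn golden carrying custom column values."""
    # Assumption: Golden accepts a `custom_column_key_values` dict.
    from deepeval.dataset import Golden

    return Golden(
        input=input_text,
        custom_column_key_values=check_custom_columns(columns),
    )
```

The resulting golden can then be pushed or queued like any other.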
Assign Goldens For Annotation
You can also assign goldens to different team members for review and annotation.