Manage Datasets
Overview
A dataset is a list of goldens, either single-turn or multi-turn, and forms the basis of any evaluation workflow in development. In this section, you’ll learn how to work with goldens in datasets, including:
- Understanding the golden structure for single and multi-turn datasets
- Uploading goldens via CSV on the platform
- Assigning different team members to review and finalize goldens
If you haven’t already, familiarize yourself with what goldens are.
Create a Dataset
You can create a dataset under Project > Datasets (select either the single-turn or multi-turn tab, depending on the type of dataset you wish to create).
Golden Structure
Understanding the golden structure is essential before uploading your data. Goldens are the building blocks of datasets, and their structure differs slightly between single-turn and multi-turn evaluations:
Single-Turn
Multi-Turn
Avoid pre-populating Actual Output, Retrieval Context, or Tools Called for single-turn goldens, and Turns for multi-turn goldens. These fields are meant to be populated dynamically during evaluation.
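As a rough illustration of the advice above, the sketch below models a single-turn golden as a plain Python dataclass and flags any evaluation-time field that was filled in ahead of time. The class, the snake_case field names, and the helper function are hypothetical stand-ins derived from the field names on this page; they are not the platform's or DeepEval's actual types.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class SingleTurnGoldenSketch:
    """Hypothetical stand-in for a single-turn golden (not a real API type)."""

    # Fields you author up front.
    input: str
    expected_output: Optional[str] = None
    context: Optional[List[str]] = None
    expected_tools: Optional[List[Dict[str, Any]]] = None
    additional_metadata: Optional[Dict[str, Any]] = None
    comments: Optional[str] = None
    # Fields meant to be populated dynamically during evaluation.
    actual_output: Optional[str] = None
    retrieval_context: Optional[List[str]] = None
    tools_called: Optional[List[Dict[str, Any]]] = None


RUNTIME_FIELDS = ("actual_output", "retrieval_context", "tools_called")


def prepopulated_runtime_fields(golden: SingleTurnGoldenSketch) -> List[str]:
    """Return the names of runtime fields that were filled in ahead of time."""
    return [name for name in RUNTIME_FIELDS if getattr(golden, name) is not None]


golden = SingleTurnGoldenSketch(
    input="What is your refund policy?",
    expected_output="Refunds are available within 30 days of purchase.",
    context=["Refund policy: 30 days from the purchase date."],
)
print(prepopulated_runtime_fields(golden))
```

A check like this could run before an upload to catch goldens that accidentally carry stale outputs.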
Upload Goldens via CSV
You can upload both single-turn and multi-turn goldens stored in CSV files to datasets. The only difference is the set of golden fields that you map to CSV headers.
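If your goldens live in code rather than in a spreadsheet, a CSV for upload can be produced with Python's standard csv module. The sketch below is a minimal example for single-turn goldens. The header names are arbitrary assumptions (headers are mapped to golden fields during upload), and the sample rows are made up for illustration.

```python
import csv
import io
from typing import Dict, List

# Hypothetical header names; they are mapped to golden fields at upload time.
HEADERS = ["input", "expected_output", "comments"]

rows: List[Dict[str, str]] = [
    {
        "input": "What is your refund policy?",
        "expected_output": "Refunds are available within 30 days.",
        "comments": "Happy path",
    },
    {
        "input": 'Can I return an item marked "final sale"?',
        "expected_output": "No, final sale items cannot be returned.",
        "comments": "Contains quotes, which csv escapes automatically",
    },
]


def build_goldens_csv(goldens: List[Dict[str, str]]) -> str:
    """Serialize golden rows to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=HEADERS)
    writer.writeheader()
    writer.writerows(goldens)
    return buffer.getvalue()


csv_text = build_goldens_csv(rows)
print(csv_text)
```

Writing `csv_text` to a file gives you something to select in the upload dialog. Using `csv.DictWriter` rather than manual string joining keeps commas and quotes inside inputs from breaking the column layout.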
Other Actions
Beyond creating and uploading, you can also:
- Add Images — drag and drop images into text fields for multi-modal goldens
- Edit Non-Text Columns — modify structured fields like Context, Expected Tools, and Tools Called
- Add Custom Columns — extend goldens with additional metadata fields
- Assign Goldens — delegate review to team members
- (Un)finalize Goldens — enable or disable goldens for testing
- Duplicate Dataset — create a copy of an existing dataset
- Delete Dataset — permanently remove a dataset
Adding Images
Datasets on Confident AI are multi-modal by nature — images are natively supported alongside text. You can add images to goldens by dragging and dropping them directly into any text field, including Input, Expected Output, Context, and other list-of-text fields.
When you upload an image, Confident AI stores it and generates a public URL. This URL is embedded in your golden’s text fields using a special format: [DEEPEVAL:IMAGE:uuid]. When you pull the dataset for evaluation, you can parse these into an evaluatable format.
Learn how to parse multi-modal goldens into an evaluatable format when pulling datasets in code.
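To make the placeholder format concrete, here is a minimal sketch that splits a text field into text and image segments with a regular expression. It is not the library's own parser: the function name and the tuple layout are assumptions, and it only extracts the identifier inside the placeholder without resolving it to a URL.

```python
import re
from typing import List, Tuple

# Matches placeholders of the form [DEEPEVAL:IMAGE:<id>] and captures <id>.
IMAGE_PLACEHOLDER = re.compile(r"\[DEEPEVAL:IMAGE:([^\]]+)\]")


def split_text_and_images(text: str) -> List[Tuple[str, str]]:
    """Split text into ("text", ...) and ("image", id) segments, in order."""
    parts: List[Tuple[str, str]] = []
    cursor = 0
    for match in IMAGE_PLACEHOLDER.finditer(text):
        if match.start() > cursor:
            parts.append(("text", text[cursor:match.start()]))
        parts.append(("image", match.group(1)))
        cursor = match.end()
    if cursor < len(text):
        parts.append(("text", text[cursor:]))
    return parts


# Hypothetical golden input with one embedded image placeholder.
example = (
    "Describe this chart: "
    "[DEEPEVAL:IMAGE:123e4567-e89b-12d3-a456-426614174000]"
    " in one sentence."
)
for kind, value in split_text_and_images(example):
    print(kind, repr(value))
```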
Edit Non-Text Columns
Some golden fields require structured data rather than plain text. This is mostly relevant for single-turn datasets, since Context is the only such field in multi-turn datasets.
A ToolCall object has the following structure:
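As a rough sketch only, the dataclass below assumes the structure mirrors the ToolCall class in DeepEval: a required name plus optional description, reasoning, output, and input parameters. The class name ToolCallSketch and the sample values are hypothetical; verify the exact field names on the platform before relying on them.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, Optional


@dataclass
class ToolCallSketch:
    """Hypothetical stand-in; field names are assumed, not confirmed here."""

    name: str                                          # tool that was (or should be) called
    description: Optional[str] = None                  # what the tool does
    reasoning: Optional[str] = None                    # why the tool was chosen
    output: Optional[Any] = None                       # value the tool returned
    input_parameters: Optional[Dict[str, Any]] = None  # arguments passed to the tool


expected_tool = ToolCallSketch(
    name="search_orders",
    input_parameters={"order_id": "A-1001"},
)
print(asdict(expected_tool))
```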
Add Custom Columns
Add custom columns to your dataset to store additional metadata. Custom columns appear as new fields on each golden and can be used for passing dynamic values during evaluation.
Your custom columns must not be one of the default fields:
Single-Turn
Multi-Turn
- Input
- Expected Output
- Context
- Expected Tools
- Additional Metadata
- Comments
- Actual Output
- Retrieval Context
- Tools Called
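The naming rule above can be checked before you create a column. The sketch below is a minimal validator built from the default field names listed above. Comparing names case-insensitively and ignoring extra whitespace is an assumption about how the platform treats collisions, and the function names are hypothetical.

```python
# Default field names copied from the list above.
DEFAULT_FIELDS = {
    "Input",
    "Expected Output",
    "Context",
    "Expected Tools",
    "Additional Metadata",
    "Comments",
    "Actual Output",
    "Retrieval Context",
    "Tools Called",
}


def normalize(name: str) -> str:
    """Collapse whitespace and ignore case (assumed matching rule)."""
    return " ".join(name.split()).casefold()


RESERVED = {normalize(field) for field in DEFAULT_FIELDS}


def is_valid_custom_column(name: str) -> bool:
    """Return True if the name is non-empty and does not clash with a default field."""
    cleaned = normalize(name)
    return bool(cleaned) and cleaned not in RESERVED


for candidate in ["Customer Tier", "Input", "  expected   output "]:
    print(repr(candidate), is_valid_custom_column(candidate))
```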
Assign Goldens
Assign goldens to different team members for review and annotation.
(Un)finalize Goldens
Mark goldens as finalized to lock them from further edits, or unfinalize to allow changes. Finalizing is useful when you’ve reviewed and approved goldens for use in evaluations.
Duplicate Dataset
Create a copy of an existing dataset. Useful when you want to create variations or preserve a snapshot before making changes.
Delete Dataset
Remove a dataset permanently on the platform:
This action cannot be undone. All goldens or conversational goldens in the dataset will be permanently deleted.
Next Steps
Now that you know how to manage datasets on the platform, learn how to use them for evaluations or work with them programmatically in your code.