Introduction
Overview
Synthetic data generation lets you automatically create high-quality goldens from your existing data sources: documents stored in Google Drive, messages in Slack channels, pages in Notion, or files in SharePoint. Instead of writing goldens by hand one at a time, you connect a data source and let Confident AI generate evaluation-ready goldens at scale.
How It Works
At a high level, synthetic data generation follows three steps:
Connect a Data Source
Navigate to Project Settings > Data Sources and connect an external source such as Google Drive, Slack, Notion, or SharePoint. Each source type requires its own set of credentials.
Supported Data Sources
Google Drive: Connect a shared folder and generate goldens from .txt, .pdf, and .docx files.
Slack: Generate goldens from channel message histories.
Notion: Pull page content and generate goldens from your knowledge base.
SharePoint: Connect via Azure AD and generate goldens from SharePoint files.
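
Because the shared-folder source reads only .txt, .pdf, and .docx files, it can help to check a folder's contents before connecting it. The snippet below is a minimal local sketch of that check, not part of Confident AI: the helper name `split_by_support` and the sample file names are hypothetical, and treating extensions as case-insensitive is an assumption.

```python
from pathlib import PurePosixPath

# File types listed above for shared folders.
SUPPORTED_EXTENSIONS = {".txt", ".pdf", ".docx"}


def split_by_support(filenames):
    """Hypothetical helper: split file names into (supported, skipped) lists."""
    supported, skipped = [], []
    for name in filenames:
        # Assumption: extensions are compared case-insensitively.
        ext = PurePosixPath(name).suffix.lower()
        if ext in SUPPORTED_EXTENSIONS:
            supported.append(name)
        else:
            skipped.append(name)
    return supported, skipped


# Illustrative file names only.
supported, skipped = split_by_support(
    ["handbook.pdf", "notes.TXT", "roadmap.pptx", "faq.docx"]
)
print("supported:", supported)
print("skipped:", skipped)
```

Running a check like this first shows which files would need converting to a supported format before they can contribute goldens.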