Introducing Report Templates: Build the report your team actually reads

Report Templates let you customize the reports Confident AI generates for your team. Build daily reports that dig into traces, identify where your AI agent is underperforming, summarize common usage patterns, and show the exact pages and sections you care about.

Jeffrey Ip

Jun 26, 2026

3 min read

Introducing Synthetic Data Generation Pipelines: Customize how you generate data

Many teams already had great synthetic data generation pipelines running locally, but consolidating that work on one platform usually meant giving up flexibility. Synthetic Data Generation Pipelines bring that control into Confident AI: choose the sources to draw context from, wire them together, and tune each generation step.

Jeffrey Ip

Jun 25, 2026

3 min read

Introducing Annotation Forms: Capture any human feedback without leaving Confident AI

Human review only helps if everyone captures the same thing. Annotation Forms let you define the exact set of fields reviewers fill in — text, numbers, scales, yes/no, single and multiple choice, and scored criteria — so every annotation comes back structured, consistent, and ready to act on.

Jeffrey Ip

Jun 24, 2026

5 min read

Introducing AI Observability Workflows: Custom automations for every trace on the platform

Dataset ingestion, queue ingestion, evaluation rules, and classifiers have lived on Confident AI for a while — but in separate corners of the product. Workflows brings them into one interface: a single graph of your post-ingestion pipeline, with a tab to configure each task. Here's how it works.

Jeffrey Ip

Jun 23, 2026

5 min read

Introducing AI Governance: Standardized evals, policies, and controls

As AI spreads across an org, every team evaluates differently and no one can answer "is this ready to ship?" AI Governance is the layer on top of the evals, observability, and red teaming your teams already run, turning those signals into one standard, enforced at deploy time.

Jeffrey Ip

Jun 22, 2026

5 min read

Launch Week Day 5 (5/5): Generate Datasets from Your Data Sources

Your best evaluation data already exists — it's sitting in Google Drive, SharePoint, Notion, and S3. Dataset generation on Confident AI turns your existing documents into evaluation-ready datasets automatically.

Jeffrey Ip

Apr 4, 2026

4 min read

Launch Week Day 4 (4/5): Auto-Categorize Traces & Threads

You can't improve what you can't see. Auto-categorization tells you what your users are actually asking, detects response drift, and shows you which categories perform best — and which ones need help.

Jeffrey Ip

Apr 3, 2026

4 min read

Launch Week Day 3 (3/5): Auto-Ingest Traces into Datasets & Annotation Queues

Production traces are the best dataset you’ll ever get — but most teams never turn them into one. With auto-ingest, your traces flow straight into datasets and annotation queues, continuously.

Brian Romain

Apr 2, 2026

4 min read

Launch Week Day 2 (2/5): Scheduled Evals

Everyone agrees evals should run regularly. But nobody remembers to actually run them. Scheduled Evals fixes that — set the frequency, configure your mappings, and never scramble before a release again.

Kritin Vongthongsri

Apr 1, 2026

3 min read

Announcing Launch Week Q1 '26! Day 1: Automated Error Analysis

Error analysis used to mean pulling traces in code, hacking together an LLM to recommend metrics, and hoping for the best. Confident AI now does it for you.

Jeffrey Ip

Mar 31, 2026

4 min read