Introducing Report Templates: Build the report your team actually reads

Jun 26, 2026·3 min read

Jeffrey Ip

Co-founder @ Confident AI. Creator of DeepEval & DeepTeam. Building an unhealthy LLM evals addiction. Ex-Googler (YouTube), Microsoft AI (Office365).

Imagine opening a single report every morning that tells you exactly which part of your AI agent broke overnight, what your users were actually trying to do, and which traces to look at first — written the way your team needs to read it, not the way a dashboard decided to show it.

That report doesn't exist out of the box, because every team measures AI quality differently. So today we're launching Report Templates on Confident AI: you decide what gets analyzed, what gets summarized, and what your team sees when they open the report.

Design report templates with the pages, sections, stats, tables, and AI-generated summaries your team needs.

The problem: reports were too fixed

Different teams need to answer different questions.

One team might want a daily usage report that explains which metrics dropped overnight. Another might want a weekly exec summary. Another might want a deep dive into traces where answer relevancy failed, grouped by the user intent behind each request.

The data is already in Confident AI, but a fixed report format can only go so far. If the report does not match the way your team reviews agent quality, people end up exporting data, rewriting summaries, or building their own reporting flow outside the platform.

Build the report you want to see

Report Templates let you define the structure once and reuse it whenever you need a report.

You can add new pages, create custom sections, write prompts for AI-generated summaries, and choose the stats, charts, and tables that belong in the final report.

For an AI agent quality report, that might mean:

Overview — summarize what happened across the project
Key findings — call out the most important regressions and usage shifts
Stats — show test runs, pass rate, average cost, and the top failing metrics
Metric breakdowns — compare performance across answer relevancy, faithfulness, bias, toxicity, summarization, and more
Trace deep dives — surface the examples behind the numbers so your team can inspect what went wrong

The template controls the shape of the report, so the output matches the way your team actually reviews AI quality.
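To make the idea of "define the structure once and reuse it" concrete, here is a minimal sketch of how such a template could be modeled as plain data. This is not the Confident AI API: the class names (`ReportTemplate`, `Page`, `Section`), the `kind` values, and the field names are all hypothetical and exist only to illustrate pages containing sections, where a section is either an AI-generated summary driven by a prompt or a set of stats or table columns.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Section:
    """One block in a report page (hypothetical model, not a real API)."""
    title: str
    kind: str  # assumed values: "ai_summary", "stats", "table"
    prompt: Optional[str] = None  # only meaningful for "ai_summary"
    fields: List[str] = field(default_factory=list)  # stats or table columns


@dataclass
class Page:
    title: str
    sections: List[Section] = field(default_factory=list)


@dataclass
class ReportTemplate:
    name: str
    pages: List[Page] = field(default_factory=list)

    def outline(self) -> List[str]:
        """Flatten the template into 'page / section (kind)' lines."""
        return [
            f"{page.title} / {section.title} ({section.kind})"
            for page in self.pages
            for section in page.sections
        ]


# The example structure from the list above, expressed as data.
agent_quality = ReportTemplate(
    name="AI agent quality report",
    pages=[
        Page("Summary", [
            Section("Overview", "ai_summary",
                    prompt="Summarize what happened across the project."),
            Section("Key findings", "ai_summary",
                    prompt="Call out the most important regressions and usage shifts."),
            Section("Stats", "stats",
                    fields=["test_runs", "pass_rate", "average_cost",
                            "top_failing_metrics"]),
        ]),
        Page("Details", [
            Section("Metric breakdowns", "table",
                    fields=["answer_relevancy", "faithfulness", "bias",
                            "toxicity", "summarization"]),
            Section("Trace deep dives", "table",
                    fields=["trace_id", "failed_metric", "user_input"]),
        ]),
    ],
)

for line in agent_quality.outline():
    print(line)
```

Keeping the template as data rather than code is the point of the sketch: the same structure can be rendered daily or weekly without anyone rewriting the report by hand.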

From traces to answers

The useful part is not just rendering a prettier report. It is turning the raw evaluation and observability data into something your team can act on.

Report Templates can dig into traces, identify common usage patterns, summarize the biggest changes, and organize the findings around the questions you care about.

That means a daily report can tell you:

Which part of your AI agent is not performing well
Which metrics are driving the drop
Which user requests are showing up most often
Which traces explain the pattern
What changed since the last reporting period

Instead of starting from a blank dashboard every morning, your team starts from a report built for the decision they need to make.
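As a rough illustration of the "which metrics are driving the drop" and "what changed since the last reporting period" questions, the sketch below compares per-metric pass rates across two periods. It is an assumption about how such a comparison could work, not a description of what Confident AI does internally; the function names and the `(metric_name, passed)` input format are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical input format: one (metric_name, passed) pair per evaluated trace.
Result = Tuple[str, bool]


def pass_rates(results: List[Result]) -> Dict[str, float]:
    """Fraction of passing evaluations for each metric."""
    totals: Dict[str, int] = defaultdict(int)
    passes: Dict[str, int] = defaultdict(int)
    for metric, passed in results:
        totals[metric] += 1
        passes[metric] += int(passed)
    return {metric: passes[metric] / totals[metric] for metric in totals}


def biggest_drops(previous: List[Result], current: List[Result],
                  top_n: int = 3) -> List[Tuple[str, float]]:
    """Metrics whose pass rate fell between periods, worst first."""
    prev = pass_rates(previous)
    curr = pass_rates(current)
    deltas = [(m, curr[m] - prev[m]) for m in curr if m in prev]
    drops = [d for d in deltas if d[1] < 0]
    return sorted(drops, key=lambda d: d[1])[:top_n]


# Made-up sample data for two reporting periods.
yesterday = [("answer_relevancy", True)] * 4 + [("faithfulness", True)] * 2
today = ([("answer_relevancy", True)] + [("answer_relevancy", False)] * 3
         + [("faithfulness", True)] * 2)

print(biggest_drops(yesterday, today))
```

With this sample data, only answer relevancy appears in the result, because faithfulness did not change between the two periods.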

Get started

Report Templates are live on Confident AI now.

Open Project Settings -> Report Templates, create a template, and start adding the pages and sections your team wants to see.
