Trace-Level Detections

Per-span vulnerability findings generated during risk assessments on traced applications.

Overview

When your AI application is traced and linked to a risk assessment via test_case_id (or turn_id for multi-turn), Confident AI scans each span in the trace for vulnerability findings after the assessment completes. These per-span findings are called Detections.

This requires your AI application to be instrumented for tracing. See LLM Tracing Introduction to get started.

How it works

The trace scan runs once the assessment finalizes. No extra configuration is required beyond linking your traces to test cases.

Prerequisites

  • Your AI application instrumented for tracing on Confident AI — see LLM Tracing Introduction
  • Each trace linked to its test case using test_case_id (single-turn) or turn_id (multi-turn) — see setup instructions below

Linking traces to test cases

When Confident AI sends an attack to your AI Connection, it includes a test_case_id in the request payload. Pass that ID into your tracing implementation so Confident AI can match the trace back to the correct test case.

AI Connections

Confident AI sends testCaseId (and turnId for multi-turn) automatically in the request payload. Forward it to your tracing setup.

Setup instructions and code examples:

  • Linking test cases to traces (single-turn)
  • Linking turns to traces (multi-turn)
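As a rough illustration of the forwarding step, the sketch below pulls the identifiers out of an incoming request payload and renames them for the tracing side. The payload fields testCaseId and turnId and the tracing names test_case_id and turn_id come from the text above. The helper name extract_link_ids, the input field, and the sample values are assumptions made for illustration and are not part of Confident AI's API. The linked setup guides show the actual tracing calls.

```python
from typing import Any, Dict


def extract_link_ids(payload: Dict[str, Any]) -> Dict[str, str]:
    """Hypothetical helper: map payload IDs to tracing-side names.

    Assumes the request body carries `testCaseId` and, for multi-turn
    assessments, `turnId`. Missing or empty values are skipped.
    """
    mapping = {"testCaseId": "test_case_id", "turnId": "turn_id"}
    ids: Dict[str, str] = {}
    for payload_key, trace_key in mapping.items():
        value = payload.get(payload_key)
        if value:
            ids[trace_key] = str(value)
    return ids


# Made-up payloads; real requests may carry other fields.
single_turn = {"input": "attack prompt", "testCaseId": "tc_123"}
multi_turn = {"input": "attack prompt", "testCaseId": "tc_123", "turnId": "turn_7"}

print(extract_link_ids(single_turn))
print(extract_link_ids(multi_turn))
```

The returned dictionary is meant to be passed on to whatever tracing call your setup uses, so the trace carries the same ID that arrived with the attack.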

Detections

A detection is a vulnerability finding attributed to a specific span. The assessment’s configured evaluation model analyzes each span’s input and output, together with its position in the execution tree, to determine whether a vulnerability was introduced.

Outcomes

| Outcome | Description |
| --- | --- |
| materialized | The span produced violating content and it reached the user; no downstream span caught it. |
| mitigated | The span produced violating content but a downstream span sanitized, blocked, or replaced it before the final output. |
| attempted | A clear attempt to introduce the vulnerability, but no breach occurred. |

Distinguishing materialized from mitigated requires the evaluation model to reason across the parent-child span chain. More capable models handle this more reliably in deep or complex trace trees.
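To make the three outcomes concrete, here is a minimal sketch of how they might be modeled when post-processing findings on your side. The outcome values come from the table above. The Outcome and Detection types, their field names, and the sample findings are hypothetical and do not reflect Confident AI's data model or export format.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Outcome(str, Enum):
    MATERIALIZED = "materialized"  # violating content reached the user
    MITIGATED = "mitigated"        # a downstream span stopped it
    ATTEMPTED = "attempted"        # clear attempt, no breach


@dataclass
class Detection:
    # Hypothetical fields chosen for this example only.
    span_name: str
    vulnerability: str
    outcome: Outcome


def reached_user(detections: List[Detection]) -> List[Detection]:
    """Keep only findings whose violating content reached the user."""
    return [d for d in detections if d.outcome is Outcome.MATERIALIZED]


# Made-up findings for illustration.
findings = [
    Detection("llm_call", "PII leakage", Outcome.MITIGATED),
    Detection("summarizer", "toxicity", Outcome.MATERIALIZED),
    Detection("retriever", "prompt injection", Outcome.ATTEMPTED),
]

for d in reached_user(findings):
    print(d.span_name, d.vulnerability, d.outcome.value)
```

Filtering on materialized first is one reasonable triage order, since those are the findings no downstream span caught.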

Viewing detections

Spans with detections show a shield icon in the trace tree. Click any span and open the Detections tab to see the full list of findings for that span — including outcome, vulnerability type, attack vector, and reason.

Figure: Shield icons in the trace tree and the Detections tab in the span detail panel.

Span attribution

Detections are attributed to the span that introduced the vulnerability, not to parent or wrapper spans. For example, if a child LLM span generates harmful content and a parent guardrail span blocks it before output:

  • The child LLM span gets a mitigated detection
  • The parent span gets no detection

This means detections in multi-span pipelines reflect where the issue originated, not which spans happened to pass the output along.
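The attribution rule can be pictured with a small span tree. The sketch below rebuilds the example above, a parent guardrail span wrapping a child LLM span, and collects detections per span. The Span type, the span names, and detections_by_span are assumptions for illustration only. In practice the findings come from Confident AI's trace scan, not from code you write.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Span:
    # Hypothetical span model: a name, outcome labels, and child spans.
    name: str
    detections: List[str] = field(default_factory=list)
    children: List["Span"] = field(default_factory=list)


def detections_by_span(root: Span) -> Dict[str, List[str]]:
    """Walk the tree and return only spans that carry detections."""
    found: Dict[str, List[str]] = {}
    stack = [root]
    while stack:
        span = stack.pop()
        if span.detections:
            found[span.name] = list(span.detections)
        stack.extend(span.children)
    return found


# The child LLM span introduced the issue; the parent guardrail blocked it.
llm = Span("llm_call", detections=["mitigated"])
guardrail = Span("guardrail", children=[llm])

print(detections_by_span(guardrail))  # {'llm_call': ['mitigated']}
```

The guardrail span is absent from the result because it only handled the content, while the LLM span that produced it carries the mitigated detection.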

Notes

  • Trace scanning runs alongside the standard pass/fail evaluation on the test case’s final output. Both appear in the assessment view.
  • The trace scan uses the vulnerability definitions from your security framework, including any custom vulnerabilities.
  • Detections are generated for any traced application with traces linked via test_case_id or turn_id.

Next steps

  • Risk Profiles: View CVSS scores, vulnerability coverage, and exploitability breakdowns across your assessments.
  • LLM Tracing Introduction: Instrument your AI application for tracing on Confident AI.