Trace-Level Detections
Per-span vulnerability findings generated during risk assessments on traced applications.
When your AI application is traced and linked to a risk assessment via test_case_id (or turn_id for multi-turn), Confident AI scans each span in the trace for vulnerability findings after the assessment completes. These per-span findings are called Detections.
This requires your AI application to be instrumented for tracing. See LLM Tracing Introduction to get started.
The trace scan runs once the assessment finalizes. No extra configuration is required beyond linking your traces to test cases.
Traces must be linked via test_case_id (single-turn) or turn_id (multi-turn); see the setup instructions below.
When Confident AI sends an attack to your AI Connection, it includes a test_case_id in the request payload. Pass that ID into your tracing implementation so Confident AI can match the trace back to the correct test case.
Confident AI sends testCaseId (and turnId for multi-turn) automatically in the request payload. Forward it to your tracing setup.
Setup instructions and code examples:
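As a rough, SDK-agnostic illustration of the forwarding step, the sketch below reads the linking IDs from an incoming request payload and hands them to a tracing hook. The payload key names follow the spellings used on this page, and `attach_to_trace` is a hypothetical callback standing in for whatever call your tracing setup uses to tag the current trace; neither `extract_link_ids` nor `handle_request` is part of the Confident AI SDK.

```python
from typing import Any, Callable, Optional


def extract_link_ids(payload: dict[str, Any]) -> dict[str, Optional[str]]:
    """Pull the linking IDs out of an incoming request payload.

    Assumption: the payload is a JSON object that may carry the IDs under
    either the camelCase or snake_case spelling used in this page.
    """
    test_case_id = payload.get("testCaseId") or payload.get("test_case_id")
    turn_id = payload.get("turnId") or payload.get("turn_id")
    return {"test_case_id": test_case_id, "turn_id": turn_id}


def handle_request(
    payload: dict[str, Any],
    attach_to_trace: Callable[..., None],
) -> dict[str, Optional[str]]:
    """Forward whichever IDs are present to the tracing hook.

    `attach_to_trace` is a hypothetical stand-in for your tracing SDK's
    "tag the current trace" call; it receives only the IDs that were sent.
    """
    ids = extract_link_ids(payload)
    present = {key: value for key, value in ids.items() if value is not None}
    if present:
        attach_to_trace(**present)
    return ids
```

In a single-turn assessment only the test case ID would be forwarded; in a multi-turn assessment the turn ID is passed along as well, so each trace can be matched to its turn.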
A detection is a vulnerability finding attributed to a specific span. The assessment’s configured evaluation model analyzes each span’s input and output, together with its position in the execution tree, to determine whether a vulnerability was introduced.
Distinguishing materialized from mitigated requires the evaluation model to reason across the parent-child span chain. More capable models handle this more reliably in deep or complex trace trees.
Spans with detections show a shield icon in the trace tree. Click any span and open the Detections tab to see the full list of findings for that span — including outcome, vulnerability type, attack vector, and reason.
Detections are attributed to the span that introduced the vulnerability, not to parent or wrapper spans. For example, if a child LLM span generates harmful content and a parent guardrail span blocks it before output, the child LLM span receives a mitigated detection and the guardrail span receives none.
This means detections in multi-span pipelines reflect where the issue originated, not which spans happened to pass the output along.
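The attribution rule can be pictured with a small toy model. This is only an illustrative sketch, not how Confident AI implements the scan: in the real system the evaluation model judges each span, whereas here two hypothetical boolean flags (`introduced_issue`, `blocked_issue`) stand in for that judgement, and the `Span`, `Detection`, and `attribute` names are invented for the example.

```python
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class Span:
    name: str
    parent: Optional["Span"] = None
    introduced_issue: bool = False  # assumed flag: this span produced the problem
    blocked_issue: bool = False     # assumed flag: this span stopped a descendant's problem


@dataclass
class Detection:
    span_name: str
    outcome: str  # "materialized" or "mitigated"


def ancestors(span: Span) -> Iterator[Span]:
    """Walk up the parent chain, nearest ancestor first."""
    node = span.parent
    while node is not None:
        yield node
        node = node.parent


def attribute(spans: list[Span]) -> list[Detection]:
    """Attach each finding to the span that introduced it.

    The outcome is "mitigated" if any ancestor blocked the issue before it
    reached the output, otherwise "materialized". Spans that only wrap or
    relay the output never receive a detection of their own.
    """
    detections = []
    for span in spans:
        if not span.introduced_issue:
            continue
        mitigated = any(a.blocked_issue for a in ancestors(span))
        outcome = "mitigated" if mitigated else "materialized"
        detections.append(Detection(span.name, outcome))
    return detections


# The example from the text: a guardrail span wrapping an LLM span.
guardrail = Span("guardrail", blocked_issue=True)
llm = Span("llm", parent=guardrail, introduced_issue=True)
result = attribute([guardrail, llm])
```

Here `result` contains a single mitigated detection on the `llm` span and nothing on `guardrail`, which mirrors the rule that findings point at where the issue originated.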