Introduction to AI Governance | Confident AI Docs

Overview

AI Governance on Confident AI lets you turn your organization’s compliance requirements into policies that are continuously enforced across your projects. A policy is a group of controls — individual, measurable requirements that are automatically assessed against the real state of each project (its datasets, traces, alerts, test runs, risk assessments, and more).

Define your standard for evaluation, observability, and red teaming once, and apply it everywhere. Every project assigned to a policy is held to the same quality bar, so every team ships with confidence that their AI meets the standard your organization expects. Governance makes that bar explicit, consistent, and automatically enforced — giving everyone a shared definition of what “good” looks like.

This gives compliance and engineering teams a single source of truth for answering “Is this AI application allowed to ship?” — and lets you block deployments that don’t meet your standards.

AI Governance is an enterprise feature. Contact us if you’d like it enabled for your organization.

How it works

Define controls

A control is a single requirement, such as “traces are being logged”, “p95 latency stays under 2s”, or “the latest official red teaming assessment passed”. Controls are assessed automatically and resolve to a status.

Group controls into a policy

A policy is a named group of controls — typically mapped to a compliance framework such as the EU AI Act or NIST AI RMF. A policy is met only when all of its controls pass.

Assign projects to a policy

Each project belongs to at most one policy. Every project assigned to a policy is assessed against all of that policy’s controls.
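The relationship between controls, policies, and projects can be pictured with a small data model. This is only an illustrative sketch, not Confident AI's implementation: the `Control`, `Policy`, and `PolicyAssignments` names are hypothetical, and the choice to let a new assignment replace the previous one is an assumption made to keep the "at most one policy" rule simple.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Control:
    # Hypothetical stand-in for a single requirement,
    # e.g. "traces are being logged".
    name: str


@dataclass
class Policy:
    # A named group of controls, e.g. one mapped to a compliance framework.
    name: str
    controls: list[Control] = field(default_factory=list)


class PolicyAssignments:
    """Tracks which policy each project is assigned to (at most one)."""

    def __init__(self) -> None:
        self._by_project: dict[str, Policy] = {}

    def assign(self, project_id: str, policy: Policy) -> None:
        # Assumption: re-assigning replaces the previous policy,
        # so a project can never hold two policies at once.
        self._by_project[project_id] = policy

    def controls_for(self, project_id: str) -> list[Control]:
        # A project is assessed against every control of its policy;
        # an unassigned project has nothing to assess.
        policy = self._by_project.get(project_id)
        return list(policy.controls) if policy else []
```

Keying the mapping by project makes the one-policy-per-project constraint structural rather than something that has to be validated separately.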

Assess and gate

Assessments run automatically on a daily schedule, and you can also trigger them on demand. To use an assessment as a deploy gate, run it in CI/CD so that a release is blocked unless every control passes.

Core concepts

Governance Policies

A group of controls that maps to a compliance requirement. Each project belongs to at most one policy, and the policy is met only when every control passes.

Governance Controls

The individual requirements that get assessed. They span operational, runtime, and pre-deployment checks across evals and red teaming.

Control types

Controls come in four types, each covering a different slice of your AI lifecycle:

Type	What it checks
Operational	Static configuration checks — e.g. datasets exist, traces are logged, alerts are configured.
Runtime	Threshold-based metrics over your observability data (traces, spans, threads), much like alerts.
Pre-deployment (evals)	Gates on a recent test run — for example, requiring the latest official run to pass.
Pre-deployment (red teaming)	Gates on a recent risk assessment from your red teaming workflows.

See Controls for the full breakdown of each type and how they’re configured.
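To make the runtime type more concrete, here is a rough sketch of how a threshold control such as “p95 latency stays under 2s” could be evaluated over a window of latency samples. It is not the platform's actual logic: the function names, the nearest-rank percentile method, and the strict "under" comparison are all assumptions. The returned strings match the statuses listed under Assessment statuses.

```python
def p95(values: list[float]) -> float:
    """95th percentile using the nearest-rank method (an assumed choice)."""
    ordered = sorted(values)
    # Nearest rank = ceil(0.95 * n), computed with integers to avoid
    # floating-point rounding surprises.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def assess_latency_control(latencies_s: list[float], threshold_s: float = 2.0) -> str:
    """Hypothetical runtime control: p95 latency must stay under threshold_s."""
    if not latencies_s:
        # No metrics in the window, so there is nothing to assess.
        return "NO_DATA"
    return "PASS" if p95(latencies_s) < threshold_s else "FAIL"
```

With twenty samples, a single slow request does not move the nearest-rank p95, but two slow requests do, which is why the same control can pass on one window and fail on the next.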

Assessment statuses

Every control assessment resolves to one of four statuses:

Status	Meaning
`PASS`	The control’s requirement is satisfied.
`FAIL`	The requirement is not satisfied — e.g. a check failed, a threshold was breached, or the gated run didn’t match.
`ERROR`	The assessment couldn’t run, usually due to a misconfigured control.
`NO_DATA`	There was no data to assess — e.g. no metrics in the window, or no qualifying run yet.

A policy is considered met only when every control resolves to `PASS`. Any `FAIL`, `ERROR`, or `NO_DATA` means the policy is not met.
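The all-or-nothing rule can be expressed in a few lines. This is a minimal sketch: the `Status` enum mirrors the four statuses in the table, while `blocking_controls` and `policy_met` are hypothetical helper names. How a policy with no controls is treated is not stated here, so the sketch assumes it counts as not met.

```python
from enum import Enum


class Status(Enum):
    PASS = "PASS"
    FAIL = "FAIL"
    ERROR = "ERROR"
    NO_DATA = "NO_DATA"


def blocking_controls(results: dict[str, Status]) -> list[str]:
    """Names of controls whose status is anything other than PASS."""
    return [name for name, status in results.items() if status is not Status.PASS]


def policy_met(results: dict[str, Status]) -> bool:
    # Assumption: a policy with no assessed controls is treated as not met.
    return bool(results) and not blocking_controls(results)
```

Note that `NO_DATA` and `ERROR` block the policy just like `FAIL`, so a control that has never produced data is enough to keep a policy from being met.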

Gating deployments

Enforce a policy in your CI/CD pipeline using the deepeval CLI (available in both Python and TypeScript), or call the public API directly. The gate assesses every control in the project’s policy and only passes if all of them pass:

Python

TypeScript

cURL

$ deepeval gate

The CLI exits with code 0 only when the policy is fully met, and with a non-zero code otherwise. A non-zero exit code stops your pipeline, preventing a non-compliant deployment from shipping. The Python CLI, the TypeScript CLI, and a direct cURL request all call the POST /v1/governance/assess endpoint under the hood using your project’s API key.
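If a pipeline step is itself written in Python, the exit-code contract above can be wrapped in a small helper. This is a sketch under stated assumptions: it relies only on the `deepeval gate` command and its exit-code behavior as described here, it assumes `deepeval` is installed and your API key is already configured in the environment, and the `policy_gate_passes` name is hypothetical.

```python
import subprocess
from typing import Sequence


def policy_gate_passes(command: Sequence[str] = ("deepeval", "gate")) -> bool:
    """Run the gate command and report whether it exited with code 0.

    The command is a parameter so the wrapper can be exercised in tests
    without invoking the real CLI.
    """
    # check=False: a non-zero exit is an expected outcome, not an exception.
    completed = subprocess.run(list(command), check=False)
    return completed.returncode == 0
```

A deploy script could call `policy_gate_passes()` before its release step and exit with a non-zero code when it returns `False`, which preserves the same blocking behavior as running the CLI directly.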

Learn more

Policies — group controls and assign projects
Controls — the four control types and how to configure them
Alerts — the observability primitive behind runtime controls
Risk Profiles — what red teaming pre-deployment controls assess against