Governance Controls
A control is a single, measurable requirement that is automatically assessed against the real state of a project. Controls are grouped into policies, and a policy is met only when all of its controls pass.
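As a rough illustration of the rule that a policy is met only when all of its controls pass, the sketch below reduces a set of control results to a single outcome. The `"PASS"` status string and the `policy_is_met` helper are hypothetical names chosen for this example, not platform identifiers. Treating every non-passing status as blocking is simply the literal reading of the rule above.

```python
from typing import Mapping

# Hypothetical status label; the platform's actual status names may differ.
PASS = "PASS"


def policy_is_met(control_statuses: Mapping[str, str]) -> bool:
    """Return True only when every control in the policy has passed.

    Note: an empty mapping yields True. How the platform treats a policy
    with no controls is not covered by this sketch.
    """
    return all(status == PASS for status in control_statuses.values())


# Example: one control without a passing result keeps the policy unmet.
statuses = {"error-rate-sla": PASS, "latest-eval-run": "NO_DATA"}
print(policy_is_met(statuses))
```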
Every assessment resolves to one of four statuses:
Control types
There are four control types, each assessing a different part of your AI lifecycle.
- Operational: static configuration checks that verify the project is set up the way your standards require.
- Runtime: threshold-based checks over your observability metrics.
- Pre-deployment (evals): gates on a recent evaluation test run.
- Pre-deployment (red teaming): gates on a recent red teaming risk assessment.
Operational controls
Operational controls verify that a project is configured according to your standards. They are static checks — assessed purely from the current state of the project, with no thresholds to configure. These controls ship with the platform and can’t be created manually; you simply add the ones you need to a policy.
Operational controls cover areas such as observability, datasets, integrations, and threat detection.
Time-bound checks (logged traces and threads) evaluate the last 30 days of activity.
Runtime controls
Runtime controls assess a metric over your observability data and check it against a threshold — the same model used by alerts. They are evaluated over a trailing 24-hour window.
Configure a runtime control with:
- Data model — Trace, Span, or Thread
- Aggregation — the metric to compute; options depend on the data model (e.g. count, error rate, average/percentile latency, token cost, unique end users)
- Threshold — a direction (Above or Below) and a numeric value
- Filters — optionally narrow the data the control evaluates (environment, tags, metadata, and more)
The control fails when the aggregated metric crosses the threshold (above the value for Above, below it for Below), and resolves to NO_DATA when there’s no matching data in the window.
Runtime controls are ideal for codifying production SLAs — like keeping error rates or latency within bounds — directly into a governance policy.
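The threshold logic described above can be sketched in a few lines. This is a minimal illustration, not the platform's implementation: the `Direction`, `Threshold`, `error_rate`, and `assess` names are hypothetical, the `"PASS"` and `"FAIL"` labels are assumed (only `NO_DATA` appears in this page), and the sketch assumes that a metric exactly equal to the threshold does not count as crossing it.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Sequence


class Direction(Enum):
    ABOVE = "Above"
    BELOW = "Below"


@dataclass(frozen=True)
class Threshold:
    direction: Direction
    value: float


def error_rate(trace_failed: Sequence[bool]) -> Optional[float]:
    """Fraction of failed traces, or None when the window holds no data."""
    if not trace_failed:
        return None
    return sum(trace_failed) / len(trace_failed)


def assess(metric: Optional[float], threshold: Threshold) -> str:
    """Return an assumed status label for one aggregated metric."""
    if metric is None:
        return "NO_DATA"  # no matching data in the evaluation window
    if threshold.direction is Direction.ABOVE:
        crossed = metric > threshold.value  # assumption: strict comparison
    else:
        crossed = metric < threshold.value
    return "FAIL" if crossed else "PASS"


# An SLA-style control: fail when the error rate rises above 5%.
sla = Threshold(Direction.ABOVE, 0.05)
window = [False] * 9 + [True]  # 1 failed trace out of 10
print(assess(error_rate(window), sla))  # FAIL
print(assess(error_rate([]), sla))      # NO_DATA
```

Representing "no data" as `None` rather than as a zero metric keeps an empty window from being mistaken for a healthy one.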
Pre-deployment controls (evals)
These controls gate on a recent evaluation test run. A control passes when a qualifying test run exists and matches the configured filters.
Choose how the gating run is selected:
- Latest official run — gate on the most recent run marked official in the project, or
- Identifier + window — gate on the latest completed run matching a given identifier within a rolling window (7, 14, 30, or 90 days)
You can also apply filters (including hyperparameters) — the selected run must match them to pass.
Mark a test run as official to designate it as the source of truth for gating, instead of relying on an identifier and window.
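The two selection modes and the filter check can be sketched as follows. Everything here is an assumption made for illustration: the `EvalRun` record, its field names, and the helper functions are hypothetical, the input is assumed to contain only completed runs, and recency is judged by a `completed_at` timestamp. The sketch selects one run first and then checks that run against the filters, following the description above.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Mapping, Optional, Sequence

ALLOWED_WINDOWS_DAYS = (7, 14, 30, 90)  # rolling windows listed above


@dataclass(frozen=True)
class EvalRun:
    """Hypothetical record for a completed evaluation test run."""
    identifier: str
    completed_at: datetime
    official: bool = False
    hyperparameters: Mapping[str, str] = field(default_factory=dict)


def select_latest_official(runs: Sequence[EvalRun]) -> Optional[EvalRun]:
    """Most recent run marked official, or None if there is none."""
    official = [r for r in runs if r.official]
    return max(official, key=lambda r: r.completed_at, default=None)


def select_by_identifier(runs: Sequence[EvalRun], identifier: str,
                         window_days: int, now: datetime) -> Optional[EvalRun]:
    """Latest run with the given identifier inside the rolling window."""
    if window_days not in ALLOWED_WINDOWS_DAYS:
        raise ValueError(f"window must be one of {ALLOWED_WINDOWS_DAYS}")
    cutoff = now - timedelta(days=window_days)
    candidates = [r for r in runs
                  if r.identifier == identifier and cutoff <= r.completed_at <= now]
    return max(candidates, key=lambda r: r.completed_at, default=None)


def control_passes(run: Optional[EvalRun], required: Mapping[str, str]) -> bool:
    """Pass only if a run was selected and it matches every required filter."""
    if run is None:
        return False
    return all(run.hyperparameters.get(k) == v for k, v in required.items())


now = datetime(2024, 6, 30)
runs = [
    EvalRun("nightly", datetime(2024, 6, 1), official=True,
            hyperparameters={"model": "a"}),
    EvalRun("nightly", datetime(2024, 6, 28), hyperparameters={"model": "b"}),
]
gating_run = select_by_identifier(runs, "nightly", 7, now)
print(control_passes(gating_run, {"model": "b"}))
```

Because the filters apply to the selected run only, an older run that happens to match them does not rescue the control when the latest run does not.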
Pre-deployment controls (red teaming)
These work exactly like the evals pre-deployment controls, but gate on a red teaming risk assessment instead of a test run. Select the gating assessment by its latest official result, or by identifier + window, and optionally apply filters.
This lets you require, for example, that the latest official risk assessment passed before a release is allowed to ship.
Versioning
Controls are versioned. Each time you change a control’s configuration, a new version is appended to its history. Assessments always run against the latest version, while older versions remain for audit purposes.
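One way to picture this behavior is an append-only history in which the newest entry is the one used for assessment. The sketch below is illustrative only; `ControlVersion`, `VersionedControl`, and the configuration keys are hypothetical names, not the platform's data model.

```python
from dataclasses import dataclass
from typing import Any, List, Mapping, Tuple


@dataclass(frozen=True)
class ControlVersion:
    number: int
    config: Mapping[str, Any]


class VersionedControl:
    """Append-only configuration history; the latest version is assessed."""

    def __init__(self, initial_config: Mapping[str, Any]) -> None:
        self._history: List[ControlVersion] = [
            ControlVersion(1, dict(initial_config))
        ]

    def update(self, new_config: Mapping[str, Any]) -> ControlVersion:
        """Append a new version instead of overwriting the previous one."""
        version = ControlVersion(len(self._history) + 1, dict(new_config))
        self._history.append(version)
        return version

    @property
    def latest(self) -> ControlVersion:
        """The version that assessments would run against."""
        return self._history[-1]

    @property
    def history(self) -> Tuple[ControlVersion, ...]:
        """Read-only view of all versions, oldest first, kept for audit."""
        return tuple(self._history)


control = VersionedControl({"direction": "Above", "value": 0.05})
control.update({"direction": "Above", "value": 0.02})
print(control.latest.number, len(control.history))
```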
Next steps
- Policies — group controls and assign projects
- Introduction to AI Governance — how it all fits together