Evaluation Models

Configure and manage the evaluation models used for running LLM-as-a-judge metrics in your project.

By default, Confident AI provides evaluation models for all evals run on the platform. You can, however, choose a different evaluation model for your project.

Select An Evaluation Model

To configure your evaluation model:

  1. Navigate to Project Settings → Evaluation Models
  2. Select a Model Provider from the dropdown
  3. Select the specific Model to use (e.g., gpt-4o)
  4. Click Save to apply your changes

You can only select a provider that has credentials configured. See Available Providers below for how to set up each provider.

Alternatively, toggle Inherit from Organization to use the model credentials configured at the organization level instead of configuring them per-project.
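
The selection rules above can be sketched in a few lines of Python. This is an illustrative model only, not Confident AI's implementation or API: the `ProjectSettings` class, its fields, and `selectable_providers` are hypothetical names, and the sketch assumes that inheriting from the organization replaces the project's own credentials rather than merging with them.

```python
from dataclasses import dataclass, field


@dataclass
class ProjectSettings:
    """Hypothetical model of a project's evaluation-model settings."""
    # Provider name -> credential configured on the project itself.
    project_credentials: dict[str, str] = field(default_factory=dict)
    # Mirrors the "Inherit from Organization" toggle.
    inherit_from_organization: bool = False


def selectable_providers(settings: ProjectSettings,
                         org_credentials: dict[str, str]) -> list[str]:
    """Return the providers that could be chosen in the dropdown.

    Assumption: when the toggle is on, only organization-level credentials
    count; otherwise only project-level credentials count.
    """
    source = (org_credentials if settings.inherit_from_organization
              else settings.project_credentials)
    # A provider is selectable only if a non-empty credential exists for it.
    return sorted(name for name, key in source.items() if key)


# Placeholder credentials, for illustration only.
org = {"OpenAI": "org-key-placeholder", "Anthropic": "org-key-placeholder"}
project = ProjectSettings(project_credentials={"Mistral": "project-key-placeholder"})

print(selectable_providers(project, org))  # ['Mistral']
project.inherit_from_organization = True
print(selectable_providers(project, org))  # ['Anthropic', 'OpenAI']
```

The point of the sketch is that the set of selectable providers depends on where credentials come from, so switching the toggle can change which providers appear as available.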

Note on the OpenAI gpt-5 model family: OpenAI requires the organization whose key handles the call to be verified before it will serve gpt-5 (and certain other gated models). If you select gpt-5 and the call returns a verification error:

  • Bring-your-own (BYO) key: verify your own organization at platform.openai.com under Organization → General.
  • Pooled (Confident-managed) key: pick a non-gated model such as gpt-5.4 or gpt-5.4-mini instead, or contact support.

Other Confident features (Classifiers, Error Analysis, Test Run summarizers, Auto-Annotation, Executive Insights) all use gpt-5.4 / gpt-5.4-mini by default and aren’t affected unless you explicitly pick gpt-5.
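
If you also call models from your own code, the same "fall back to a non-gated model" idea can be applied programmatically. The following is a minimal sketch under stated assumptions: `call_model` is a hypothetical function you supply, and `VerificationError` is a hypothetical exception standing in for whatever error your client raises on a verification failure. Neither is part of Confident AI or the OpenAI SDK.

```python
from typing import Callable


class VerificationError(Exception):
    """Hypothetical error: the organization is not verified for a gated model."""


def call_with_fallback(call_model: Callable[[str, str], str],
                       prompt: str,
                       preferred: str = "gpt-5",
                       fallback: str = "gpt-5.4-mini") -> tuple[str, str]:
    """Try the gated model first; on a verification error use the fallback.

    Returns a (model_used, response) pair.
    """
    try:
        return preferred, call_model(preferred, prompt)
    except VerificationError:
        # Only verification failures trigger the fallback; other errors propagate.
        return fallback, call_model(fallback, prompt)


def fake_call(model: str, prompt: str) -> str:
    """Stand-in client that simulates an unverified organization."""
    if model == "gpt-5":
        raise VerificationError("organization must be verified")
    return f"{model} judged: {prompt}"


print(call_with_fallback(fake_call, "score this answer"))
# ('gpt-5.4-mini', 'gpt-5.4-mini judged: score this answer')
```

Catching only the verification error keeps unrelated failures (rate limits, invalid keys) visible instead of silently switching models.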

Available Providers

There are three categories of providers you can configure. To set up any provider, click the three-dot menu (⋮) on the right side of the row and enter your API key or configuration details.

Model Providers — Provide your API key to run evaluations:

  • OpenAI
  • Anthropic
  • Gemini
  • X-AI
  • DeepSeek
  • Mistral
  • Perplexity

Cloud Providers — Run evaluations using models hosted on your cloud infrastructure:

  • Amazon Bedrock
  • Vertex AI

LLM Gateways — Connect a gateway to manage tag-based routing credentials:

  • Portkey
  • LiteLLM