Evaluation Models

Configure and manage the evaluation models used for running LLM-as-a-judge metrics in your project.

By default, Confident AI provides evaluation models for all evals run on the platform. You can, however, customize which evaluation model is used.
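To make the term concrete, here is a minimal, illustrative sketch (not Confident AI's actual implementation) of what an LLM-as-a-judge metric does with the configured evaluation model: the model is asked to grade an output against some criteria, and its reply is parsed into a score. The `judge` callable below is a stand-in for a real LLM client.

```python
# Hypothetical sketch of an LLM-as-a-judge metric. The `judge` callable
# stands in for whichever evaluation model is configured (e.g. gpt-4o).

def llm_as_a_judge(judge, actual_output, criteria):
    """Ask the evaluation model to score `actual_output` against `criteria`."""
    prompt = (
        f"Score the following output from 0 to 10 on this criteria: "
        f"{criteria}\n\nOutput: {actual_output}\n\nReply with only the number."
    )
    reply = judge(prompt)
    return int(reply.strip()) / 10  # normalize to a 0..1 score

# Stub judge that always replies "8", in place of a real model call
score = llm_as_a_judge(
    lambda prompt: "8",
    "Paris is the capital of France.",
    "factual accuracy",
)
print(score)  # → 0.8
```

Because the judging prompt and parsing live on the platform, swapping the evaluation model changes who grades, not how grading is wired up.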

Select An Evaluation Model

To configure your evaluation model:

  1. Navigate to Project Settings → Evaluation Model
  2. Select a Model Provider from the dropdown
  3. Select the specific Model to use (e.g., gpt-4o)
  4. Click Save to apply your changes

You can only select a provider if you have credentials configured for it. See Available Providers below to set up your credentials.

Alternatively, toggle Inherit from Organization to use the model credentials configured at the organization level instead of configuring them per-project.
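The fallback behavior of the Inherit from Organization toggle can be sketched as follows. This is an illustrative model of the resolution logic, not platform code; all names are hypothetical.

```python
# Hypothetical sketch: when "Inherit from Organization" is on, the
# organization-level credentials are used; otherwise the project's own.

def resolve_credentials(project, organization):
    """Return the credentials the evaluation model should use."""
    if project.get("inherit_from_organization"):
        # Fall back to the credentials configured at the organization level
        return organization["credentials"]
    return project["credentials"]

project = {
    "inherit_from_organization": True,
    "credentials": {"OPENAI_API_KEY": "project-key"},
}
organization = {"credentials": {"OPENAI_API_KEY": "org-key"}}

print(resolve_credentials(project, organization)["OPENAI_API_KEY"])  # → org-key
```

With the toggle off, the same call would return the project-level key instead.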

Available Providers

There are three categories of providers you can configure. To set up any provider, click the three-dot menu (⋮) on the right side of the row and enter your API key or configuration details.

Model Providers — Provide your API key to run evaluations:

  • OpenAI
  • Anthropic
  • Gemini
  • X-AI
  • DeepSeek
  • Mistral
  • Perplexity

Cloud Providers — Run evaluations using models hosted on your cloud infrastructure:

  • Amazon Bedrock
  • Vertex AI

LLM Gateways — Connect a gateway to manage tag-based routing credentials:

  • Portkey
  • LiteLLM
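Tag-based routing through a gateway can be pictured as a lookup from a tag to a provider and model. The sketch below is illustrative only and is not the Portkey or LiteLLM API; the tag names and route entries are invented for the example.

```python
# Hypothetical sketch of tag-based routing: a gateway maps a tag to
# provider credentials, so evaluations can switch providers without
# changing project-level configuration.

ROUTES = {
    "fast": {"provider": "openai", "model": "gpt-4o-mini"},
    "accurate": {"provider": "anthropic", "model": "claude-sonnet"},
}

def route(tag):
    """Resolve a tag to the provider/model it is configured to use."""
    if tag not in ROUTES:
        raise KeyError(f"No route configured for tag: {tag}")
    return ROUTES[tag]

print(route("fast")["provider"])  # → openai
```

The point of the gateway is that the routing table lives in one place: retagging an evaluation changes which provider serves it, with no credential changes on the project.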