Evaluation

Run LLM Evals

POST https://api.confident-ai.com/v1/evaluate
curl -X POST https://api.confident-ai.com/v1/evaluate \
  -H "CONFIDENT_API_KEY: <PROJECT-API-KEY>" \
  -H "Content-Type: application/json" \
  -d '{
    "metricCollection": "Collection Name",
    "llmTestCases": [
      {
        "input": "How tall is mount everest?",
        "actualOutput": "No clue, pretty tall I guess?"
      }
    ]
  }'
Response 200 (Single-Turn)
{
  "success": true,
  "data": {
    "id": "TEST-RUN-ID"
  },
  "deprecated": false
}

Run online evals for your test cases using the metrics in metricCollection.

Headers

CONFIDENT_API_KEY (string, Required)
The API key of your Confident AI project.

Request

metricCollection (string, Required)
The name of the metric collection to use for evaluation.

llmTestCases (list of objects, Optional)
A list of single-turn test cases to evaluate. Set this to null when evaluating multi-turn test cases.

conversationalTestCases (list of objects, Optional)
A list of multi-turn test cases to evaluate. Set this to null when evaluating single-turn test cases.

hyperparameters (map from strings to any, Optional)
Any hyperparameters, such as the model or prompt, that you wish to associate with the test run.

identifier (string, Optional)
A unique identifier for the test run.
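
The request fields above can be assembled programmatically. The following is a minimal sketch using only the Python standard library; it builds the request without sending it. The function name `build_evaluate_request` is hypothetical, and the "exactly one of the two test case lists" check is an assumption drawn from the field descriptions (each list should be null when the other kind is evaluated), not a documented server-side rule.

```python
import json
import urllib.request

EVALUATE_URL = "https://api.confident-ai.com/v1/evaluate"


def build_evaluate_request(api_key, metric_collection, llm_test_cases=None,
                           conversational_test_cases=None,
                           hyperparameters=None, identifier=None):
    """Build (but do not send) a POST request for /v1/evaluate.

    Hypothetical helper for illustration; not part of any official SDK.
    """
    # Assumption: exactly one kind of test case is supplied; the other stays null.
    if (llm_test_cases is None) == (conversational_test_cases is None):
        raise ValueError(
            "Provide exactly one of llm_test_cases or conversational_test_cases"
        )

    payload = {
        "metricCollection": metric_collection,
        "llmTestCases": llm_test_cases,  # serialized as null when None
        "conversationalTestCases": conversational_test_cases,
    }
    # Optional fields are only included when the caller sets them.
    if hyperparameters is not None:
        payload["hyperparameters"] = hyperparameters
    if identifier is not None:
        payload["identifier"] = identifier

    return urllib.request.Request(
        EVALUATE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "CONFIDENT_API_KEY": api_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send the request, pass the returned object to `urllib.request.urlopen`. Keeping construction separate from sending makes it easy to inspect the payload before any network call is made.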

Response

This endpoint returns an object.
success (boolean)
true if the test cases were successfully evaluated.

data (object)
Contains the id of the test run, shown as TEST-RUN-ID in the example response.

deprecated (boolean)
true if this endpoint is deprecated.
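
As a rough illustration of consuming this response, the sketch below parses a response body and returns the test run ID. The helper name `extract_test_run_id` is hypothetical, and the handling of a failed request (a body with `success` set to false) is an assumption, since this page only documents the successful response.

```python
import json


def extract_test_run_id(response_body):
    """Return data.id from a /v1/evaluate response body, or raise on failure."""
    parsed = json.loads(response_body)
    # Assumption: a failed request still returns JSON with success == false.
    if not parsed.get("success"):
        raise RuntimeError(f"Evaluation request failed: {parsed}")
    if parsed.get("deprecated"):
        print("Warning: this endpoint is marked as deprecated")
    return parsed["data"]["id"]


# Body taken from the example response on this page.
sample = '{"success": true, "data": {"id": "TEST-RUN-ID"}, "deprecated": false}'
print(extract_test_run_id(sample))  # prints TEST-RUN-ID
```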