Quickstart

5 min quickstart guide for Confident AI's Evals API

Overview

Confident AI’s Evals API allows you to run online evaluations on test cases, traces, spans, and threads. This 5-minute quickstart walks you through running your first evaluation:

  • Creating a metric collection
  • Using the /v1/evaluate endpoint to create a test run

Run Your First Eval

Here’s a step-by-step guide on how to run your first online evaluation using the Evals API.

1

Get your API key

Create a free account at https://app.confident-ai.com, and get your Project API Key.

Make sure you’re not copying your organization API key.
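If you'd rather not hard-code the key into the snippets below, one common pattern is to read it from an environment variable. This is a convention of this sketch, not a requirement of the API, and the environment variable name used here is just an example:

```python
import os

# Read the Project API Key from an environment variable, falling back to a
# placeholder. The variable name is this sketch's convention, not an API
# requirement.
api_key = os.environ.get("CONFIDENT_API_KEY", "<PROJECT-API-KEY>")

headers = {
    "CONFIDENT_API_KEY": api_key,
    "Content-Type": "application/json",
}
print(sorted(headers))
```

You can then reuse the same `headers` dict for every request in this guide.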

2

Create a metric collection

You can create a metric collection containing the metric you wish to run evals with using the POST /v1/metric-collections endpoint. Note that all metric collections must have a unique name within your project.

POST
/v1/metric-collections
import requests

url = "https://api.confident-ai.com/v1/metric-collections"

payload = {
    "name": "Collection Name",
    "multiTurn": False,
    "metricSettings": [
        {
            "metric": { "name": "Answer Relevancy" },
            "threshold": 0.8
        }
    ]
}
headers = {
    "CONFIDENT_API_KEY": "<PROJECT-API-KEY>",
    "Content-Type": "application/json"
}

response = requests.post(url, json=payload, headers=headers)

print(response.json())
3

Create test run

To run an evaluation, provide the name of your metric collection and a list of "llmTestCases" in your request body to run single-turn evaluations.

POST
/v1/evaluate
import requests

url = "https://api.confident-ai.com/v1/evaluate"

payload = {
    "metricCollection": "Collection Name",
    "llmTestCases": [
        {
            "input": "How tall is mount everest?",
            "actualOutput": "No clue, pretty tall I guess?"
        }
    ]
}
headers = {
    "CONFIDENT_API_KEY": "<PROJECT-API-KEY>",
    "Content-Type": "application/json"
}

response = requests.post(url, json=payload, headers=headers)

print(response.json())
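Because "llmTestCases" is a list, you can batch several test cases into a single /v1/evaluate call. A minimal sketch of building such a payload (the question/answer pairs below are illustrative, not from the docs):

```python
# Illustrative (input, actualOutput) pairs for a batched evaluation.
cases = [
    ("How tall is Mount Everest?", "8,849 meters above sea level."),
    ("Who wrote Hamlet?", "William Shakespeare."),
]

# Assemble the same payload shape as the single-case example above.
payload = {
    "metricCollection": "Collection Name",
    "llmTestCases": [{"input": q, "actualOutput": a} for q, a in cases],
}
print(len(payload["llmTestCases"]))  # 2
```

All cases in one request end up in the same test run, so their results are reported together.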

🎉 Congratulations! You just successfully ran your first evaluation on Confident AI via the Evals API.

The /v1/evaluate API endpoint will create a test run on Confident AI and return the following response:

Response
{
  "success": true,
  "data": {
    "id": "TEST-RUN-ID"
  },
  "deprecated": false
}
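If you want to keep the test run ID around (for example, to find the run later), you can pull it out of the response body. A small helper, assuming the response shape shown above:

```python
def get_test_run_id(body: dict) -> str:
    """Extract the test run ID from a /v1/evaluate response body."""
    # The endpoint returns {"success": true, "data": {"id": "..."}}.
    if not body.get("success"):
        raise RuntimeError(f"Evaluation did not succeed: {body}")
    return body["data"]["id"]

# Using the example response shown above:
example = {"success": True, "data": {"id": "TEST-RUN-ID"}, "deprecated": False}
print(get_test_run_id(example))  # TEST-RUN-ID
```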
4

Verify test run on the UI

After running an eval using the Evals API, your test results are automatically stored on the Confident AI platform in a comprehensive report format. You can also look up a specific test run using the TEST-RUN-ID from the API response.

Test Reports on Confident AI

Next Steps

Now that you’ve run your first online evaluation, explore these next steps to go deeper with Confident AI:

  • Custom Datasets — Create custom datasets using the datasets endpoint.
  • Prompt Templates — Iterate and version your LLM prompts directly through the prompts endpoint.
  • Human Annotations — Annotate your evaluations to enable human-in-the-loop feedback to guide metric tuning and reinforce quality with the annotation endpoint.