No-Code Evals Quickstart
No-Code Evals Quickstart
Run your first evaluation in the platform UI — no code required.
No-Code Evals Quickstart
Run your first evaluation in the platform UI — no code required.
This quickstart walks you through running your first no-code evaluation on Confident AI. By the end, you’ll have:
A no-code evaluation workflow allows non-technical team members to run an end-to-end iteration of your AI app without leaving Confident AI.
You’ll need a Confident AI account to follow along. Sign up here if you haven’t already.
Run your first evaluation by following this example for a single-turn, QA use case:
A metric collection groups the metrics you want to evaluate together.
Start with 2-3 metrics for your first evaluation. You can always add more later.
Datasets contain the goldens you’ll use to generate AI outputs.
We’ll cover all the ways you can generate AI outputs in later sections.
For this quickstart, provide a hardcoded actual output (don’t worry, we won’t be doing this later):
Now let’s evaluate your goldens against your metrics.
The evaluation will process each test case and score it against your selected metrics.
Once your run an evaluation, you will be redirected to a test run. Wait for a moment for evaluation to complete, and ✅ done!. You’ve run your first no-code evaluation.
In the testing report, you can analyze:
In later sections, you can find out more on what a test run offers.
In the quickstart above, we hardcoded the actual output directly in the dataset. This is useful for quick tests, but highly not recommedned. This is because you should aim to test changes made to your AI app, not static outputs that are pre-computed.
Confident AI offers more powerful ways to generate outputs dynamically:
Single prompt generation — define a prompt template in the platform and Confident AI calls your configured LLM provider to generate outputs automatically. Ideal for testing prompt variations or comparing models.
AI Connections — connect directly to your deployed AI system. If it’s reachable via HTTP(s), it’s testable. Customize request payloads, parse custom response structures, and pass headers or auth tokens.
AI connections are powerful because it allows Confident AI to test your AI apps as they are. However, it does require an initial small setup time from engineering.
AI Connections let you test your actual AI system end-to-end, catching integration issues that prompt-only testing misses.
Now that you’ve completed a basic evaluation, learn how to handle different use cases: