February 21, 2026 | Confident AI Docs

We Need to Talk. In Code.

TGIF! Thank god it’s features. Here’s what we shipped this week:

Big week for the org-anized among us. Multi-turn evals go code-first, Vercel joins the family, and prompts finally get the observability they deserve.

Added

Code-Based Multi-Turn Evals - Introducing ConversationalTestCase for your codebase. All the power of multi-turn evaluation, now programmable. Time to have the talk with your chatbot—in code.
Vercel AI SDK Integration - Next.js devs, rejoice! Native integration with Vercel’s AI SDK means you can trace and evaluate calls made through the ai npm package with zero friction. Ship fast, eval faster.
Transformers on Retrievers & Tools - Transformers aren’t just for AI connection outputs anymore. Reshape retriever outputs and tool calls before evaluation. Your agentic RAG pipeline called—it wants its custom parsing back.
Organization-Wide Metrics - Define metrics at the org level and share them across all your teams. No more “wait, which faithfulness config are we using?” Standardize once, evaluate everywhere.
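
To make the Code-Based Multi-Turn Evals entry above concrete, here is a minimal sketch of what defining a conversation in code can look like. It uses only the Python standard library. SketchTurn, SketchConversation, and roles_alternate are hypothetical stand-ins invented for illustration, not the library's classes, and the real ConversationalTestCase may take different fields.

```python
from dataclasses import dataclass, field
from typing import List


# Hypothetical stand-ins for illustration only. These are NOT the
# library's classes; the real ConversationalTestCase may differ.
@dataclass
class SketchTurn:
    role: str     # "user" or "assistant"
    content: str  # what was said in this turn


@dataclass
class SketchConversation:
    turns: List[SketchTurn] = field(default_factory=list)


def roles_alternate(conversation: SketchConversation) -> bool:
    """Toy structural check: the user speaks first and roles strictly alternate."""
    expected = "user"
    for turn in conversation.turns:
        if turn.role != expected:
            return False
        expected = "assistant" if expected == "user" else "user"
    return True


# A multi-turn conversation expressed as plain data in code.
conversation = SketchConversation(
    turns=[
        SketchTurn("user", "I'd like to change my flight."),
        SketchTurn("assistant", "Sure. What is your booking reference?"),
        SketchTurn("user", "It's ABC123."),
        SketchTurn("assistant", "Thanks. Which date would you like instead?"),
    ]
)
```

Because the conversation is ordinary data, it can live in version control and run in CI alongside your other tests. A real multi-turn metric would score the whole exchange; the alternation check here is only a placeholder for that step.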

Changed

Prompt Observability - Track which prompts are running in production, when they were swapped, and how performance changed. Finally, prompt feedback on your prompts.