Pydantic AI
Overview
Pydantic AI is a Python-native LLM agent framework built on the foundations of Pydantic validation. Confident AI allows you to trace and evaluate Pydantic AI agents using an OpenTelemetry-based integration.
Tracing Quickstart
For users in the EU region, please set the OTEL endpoint to the EU version:
Configure Pydantic AI
Pass DeepEvalInstrumentationSettings to your agent’s instrument parameter. This sets up the full OpenTelemetry pipeline — including span classification, trace context wiring, and export to Confident AI — in a single step.
The same configuration applies whether you run the agent synchronously, asynchronously, or with streaming.
DeepEvalInstrumentationSettings constructs a TracerProvider, registers the span processor pipeline, sets the global OTel tracer provider, and forwards itself to pydantic-ai’s Agent(instrument=...). The Confident AI API key is read automatically from the CONFIDENT_API_KEY environment variable or from deepeval login — you only need to pass api_key= explicitly if you manage keys programmatically.
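The key lookup described above can be pictured with a small standard-library sketch. It does not call deepeval. `resolve_api_key` and its `stored_login_key` parameter are hypothetical names, and the precedence between the environment variable and a key saved by deepeval login is an assumption made for illustration only.

```python
import os
from typing import Optional

ENV_VAR = "CONFIDENT_API_KEY"  # environment variable named in the text above


def resolve_api_key(
    explicit: Optional[str] = None,
    stored_login_key: Optional[str] = None,
) -> Optional[str]:
    """Hypothetical helper: pick the first key that is available.

    Assumed order: explicit argument, then environment variable,
    then a key saved by a previous login (modelled here as a plain argument).
    """
    if explicit:
        return explicit
    env_key = os.environ.get(ENV_VAR)
    if env_key:
        return env_key
    return stored_login_key
```

Under these assumptions, passing a key explicitly only matters when neither the environment variable nor a stored login key should be used.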
Run Pydantic AI
Invoke your agent by executing the script:
You can view the traces on Confident AI by clicking on the link printed in the console.
Advanced Usage
Logging threads
Threads group related traces together and are useful for chat apps, agents, or any multi-turn interaction. Pass thread_id to DeepEvalInstrumentationSettings to associate every trace from that agent with a thread. See the Confident AI documentation on threads to learn more.
Trace attributes
You can attach trace-level attributes such as name, tags, metadata, and user ID to every trace produced by the agent. Set them on DeepEvalInstrumentationSettings as static defaults; they can be overridden at runtime by calling update_current_trace(...) from inside a tool body.
View Trace Attributes
api_key: Your Confident AI API key. Falls back to the CONFIDENT_API_KEY environment variable or deepeval login.
name: The default name for traces produced by this agent.
tags: String labels that help you group related traces.
metadata: Arbitrary metadata attached to each trace. At runtime, update_current_trace(metadata=...) is merged on top of this base.
thread_id: Conversation or session ID for grouping multi-turn traces.
user_id: User identifier for user-level analytics.
metric_collection: Name of the metric collection to run online evals against each trace.
Associates a trace with a specific test case.
Identifies a specific turn within a multi-turn conversation.
All attributes are optional. They work the same way as the native tracing features on Confident AI. Any field set here can be overridden at runtime via update_current_trace(...) from inside a tool body.
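The "merged on top of this base" behaviour for metadata can be sketched with plain dictionaries. This is a minimal illustration that does not call deepeval. `merge_metadata` is a hypothetical helper, and the shallow merge (runtime keys win, nested values are not merged) is an assumption.

```python
from typing import Any, Dict, Optional


def merge_metadata(
    base: Optional[Dict[str, Any]],
    runtime: Optional[Dict[str, Any]],
) -> Dict[str, Any]:
    """Shallow-merge runtime metadata on top of the static base.

    Keys present in both dictionaries take the runtime value.
    Neither input dictionary is modified.
    """
    merged = dict(base or {})
    merged.update(runtime or {})
    return merged


# Static defaults, as they might be set once on the settings object.
static_defaults = {"env": "prod", "team": "support"}
# Values that only become known while a tool is running.
runtime_update = {"team": "billing", "ticket_id": "T-123"}

merged = merge_metadata(static_defaults, runtime_update)
print(merged)
```

Keys that appear only in the static defaults survive the merge, so a runtime update needs to carry only the values that change.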
Update trace attributes
You can enrich a trace mid-flight from inside a tool body using update_current_trace. This is useful when trace metadata depends on information only available during execution, such as a user ID resolved by a lookup tool.
update_current_trace(...) is safe to call from any tool body, including async tools and tools running in worker threads. The implicit trace context created by DeepEvalInstrumentationSettings is propagated automatically via Python contextvars.
Update span attributes
You can attach span-level attributes such as metadata or a metric collection from inside a tool body using update_current_span. This is the primary way to configure per-tool evaluation behavior.
Logging prompts
If you are managing prompts on Confident AI and wish to log them, use next_llm_span to associate a Prompt with the next LLM span before calling your agent.
Be sure to pull the prompt before logging it; otherwise the prompt will not be visible on Confident AI. next_llm_span is one-shot: it is consumed by the next LLM span produced inside the with block.
Per-call trace context
To set per-call trace attributes (such as a different user_id per request), wrap each agent invocation in a with trace(...) block. This also switches routing to Confident AI’s REST transport.
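The relationship between static defaults and per-call values can be sketched with the standard library alone. This is an illustrative model, not the deepeval implementation. `per_call_trace`, `effective_attributes`, and `handle_request` are hypothetical names, and the rule that per-call values win over static defaults is an assumption.

```python
import contextlib
import contextvars
from typing import Any, Dict, Iterator

# Attributes configured once, shared by every call.
STATIC_DEFAULTS: Dict[str, Any] = {"name": "support-agent", "user_id": "anonymous"}

# Attributes that apply only inside the current with block.
_overrides = contextvars.ContextVar("trace_overrides", default=None)


@contextlib.contextmanager
def per_call_trace(**attrs: Any) -> Iterator[None]:
    """Apply attributes for the duration of one agent invocation."""
    token = _overrides.set(attrs)
    try:
        yield
    finally:
        _overrides.reset(token)


def effective_attributes() -> Dict[str, Any]:
    """Static defaults with any per-call values layered on top."""
    return {**STATIC_DEFAULTS, **(_overrides.get() or {})}


def handle_request(user_id: str) -> Dict[str, Any]:
    with per_call_trace(user_id=user_id):
        # Stands in for the agent invocation made while handling the request.
        return effective_attributes()


first = handle_request("alice")
second = handle_request("bob")
after = effective_attributes()
print(first, second, after)
```

Each request sees its own user_id, and the static default is back in effect once the block exits, so values from one request do not leak into the next.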
Sending annotations
Send human annotations on traces or threads on Confident AI. Learn more about sending annotations.
Evals Usage
Online evals
You can run online evals on your Pydantic AI agent. Online evals run evaluations on all incoming traces on Confident AI’s servers and are the recommended approach for production agents.
Create metric collection
Create a metric collection on Confident AI with the metrics you want to use to evaluate your agent.
Your metric collection should only contain metrics that evaluate the input and output of the span or trace you are targeting.
Run evals
You can run online evals at the trace level or the span level. Pass the metric_collection parameter to the appropriate target.
Online evals can target the trace, the agent span, or a tool span.
Pass metric_collection to DeepEvalInstrumentationSettings to evaluate every trace produced by the agent.
All incoming traces and spans will now be evaluated using metrics from your metric collection.
You can view eval results on Confident AI by clicking on the link printed in the console.