LlamaIndex
Overview
LlamaIndex is an LLM framework that makes it easy to build knowledge agents from complex data. Confident AI allows you to trace and evaluate LlamaIndex agents in just a few lines of code.
Tracing Quickstart
Instrument LlamaIndex
Call instrument_llama_index once at startup, passing LlamaIndex’s root dispatcher. Every subsequent LlamaIndex call in your application will automatically be traced and sent to Confident AI.
instrument_llama_index registers DeepEval’s handler with LlamaIndex’s
instrumentation dispatcher. From that point on, all LlamaIndex spans and
events are captured automatically, with no other code changes required.
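The startup call described above can be sketched as follows, wrapped in a guard so that it runs only once per process. The import paths `deepeval.integrations.llama_index` and `llama_index.core.instrumentation` are assumptions that the text above does not state. The helper `enable_llama_index_tracing` and its `register` and `dispatcher` parameters are hypothetical, added so the guard can be exercised without either library installed.

```python
from typing import Any, Callable, Optional

_instrumented = False  # module-level guard: instrumentation should happen once


def enable_llama_index_tracing(
    register: Optional[Callable[[Any], Any]] = None,
    dispatcher: Any = None,
) -> bool:
    """Register the tracing handler once; return True only on the first call.

    `register` and `dispatcher` are hypothetical injection points for tests.
    When omitted, the real objects are imported lazily.
    """
    global _instrumented
    if _instrumented:
        return False  # already instrumented, nothing to do
    if register is None:
        # Assumed import path for the function named in the text.
        from deepeval.integrations.llama_index import instrument_llama_index
        register = instrument_llama_index
    if dispatcher is None:
        # Assumed location of LlamaIndex's root dispatcher.
        import llama_index.core.instrumentation as instrument
        dispatcher = instrument.get_dispatcher()
    register(dispatcher)
    _instrumented = True
    return True
```

In an application, calling `enable_llama_index_tracing()` with no arguments at startup would be the typical use; any later call returns `False` and does nothing.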
To view the traces on Confident AI, follow the link printed in the console output.
What Gets Traced
The integration captures spans automatically, including the Agent and LLM spans used for evaluation.
Retrieval context from RetrievalEndEvent is automatically attached to the enclosing span, making it available for retrieval-based metrics.
Each span is tagged with integration: "LlamaIndex" so you can filter by framework on Confident AI.
Advanced Features
Set trace attributes
You can attach metadata, user identifiers, and other attributes to a trace by wrapping your LlamaIndex call inside the trace context manager.
View Trace Attributes
The name of the trace.
Tags are string labels that help you group related traces.
Attach arbitrary metadata to the trace.
Supply the thread or conversation ID to view and evaluate conversations.
Supply the user ID to enable user analytics.
Override the top-level input recorded for this trace.
Override the top-level output recorded for this trace.
Explicitly set the retrieval context for this trace.
Contextual information available to the model at inference time.
The expected or ground-truth output for this trace.
Manually specify the tools called during this trace.
The expected tools that should have been called.
Each attribute is optional and works the same way as the native tracing features on Confident AI.
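As a rough sketch of wrapping a LlamaIndex call in the trace context manager, the helper below forwards only the attributes that were actually set, since every attribute is optional. The helper `run_with_trace_attributes` and its `trace_cm` parameter are hypothetical, the import path `deepeval.tracing` is an assumption, and any keyword names you pass (for example `name`, `tags`, `user_id`) are inferred from the descriptions above rather than confirmed.

```python
from typing import Any, Callable, ContextManager, Optional


def run_with_trace_attributes(
    call: Callable[[], Any],
    trace_cm: Optional[Callable[..., ContextManager[Any]]] = None,
    **attributes: Any,
) -> Any:
    """Run `call` inside a trace context that carries the given attributes.

    Attributes whose value is None are dropped before the context manager
    is entered. `trace_cm` is a hypothetical injection point for tests.
    """
    if trace_cm is None:
        # Assumed import path for the trace context manager.
        from deepeval.tracing import trace as trace_cm
    set_attributes = {k: v for k, v in attributes.items() if v is not None}
    with trace_cm(**set_attributes):
        return call()
```

The `call` argument would typically be a zero-argument function or lambda that invokes your LlamaIndex agent or query engine, so that the spans it produces fall inside the trace.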
Evals Usage
Online evals
You can run online evals on your LlamaIndex application to evaluate all incoming traces on Confident AI’s servers. This approach is recommended when your application is in production.
Create metric collection
Create a metric collection on Confident AI with the metrics you wish to use to evaluate your LlamaIndex application.
The LlamaIndex integration automatically captures input and actual_output
for Agent and LLM spans. Use metrics that require only those fields (e.g.
Answer Relevancy, Task Completion) unless you also supply
retrieval_context, context, expected_output, or expected_tools
explicitly via the trace context manager or span context objects.
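The rule in the note above can be expressed as a small check: a metric is usable as-is only if every field it needs is either captured automatically or supplied explicitly. The sketch below is illustrative only. The `REQUIRED_FIELDS` table is an assumption for demonstration (the Faithfulness entry in particular is not taken from this page), and `missing_fields` is a hypothetical helper, not part of DeepEval.

```python
from typing import Iterable

# Fields the integration captures on Agent and LLM spans without extra code.
AUTO_CAPTURED = frozenset({"input", "actual_output"})

# Illustrative requirements only; consult each metric's documentation.
REQUIRED_FIELDS = {
    "Answer Relevancy": {"input", "actual_output"},
    "Task Completion": {"input", "actual_output"},
    "Faithfulness": {"input", "actual_output", "retrieval_context"},
}


def missing_fields(metric: str, supplied: Iterable[str] = ()) -> set:
    """Return fields `metric` needs that are neither auto-captured nor supplied."""
    return set(REQUIRED_FIELDS[metric]) - AUTO_CAPTURED - set(supplied)
```

An empty result means the metric can go into the collection without further setup; a non-empty result names the fields you would need to provide through the trace context manager or a span context object.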
Span-level evals
For finer-grained control, you can attach metrics or a metric collection directly to individual Agent or LLM spans using AgentSpanContext or LlmSpanContext. This lets you evaluate specific spans independently.
View Span Context Parameters
AgentSpanContext — applied to Agent spans (i.e. workflow / agent .run() calls):
A list of DeepEval metric instances to evaluate this agent span with.
Name of a metric collection on Confident AI to use for evaluation.
The expected output for the agent span.
The expected tools that should have been called.
Contextual information for the span.
Retrieved documents or chunks for the span.
LlmSpanContext — applied to LLM spans (i.e. individual LLM calls):
A list of DeepEval metric instances to evaluate this LLM span with.
Name of a metric collection on Confident AI to use for evaluation.
A Prompt object from deepeval.prompt to associate a managed prompt with
this LLM span.
The expected output for the LLM span.
The expected tools that should have been called.
Contextual information for the span.
Retrieved documents or chunks for the span.
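A minimal sketch of attaching a separate metric collection to Agent spans and to LLM spans might look like the following. Several details are assumptions rather than confirmed API: the import path `deepeval.tracing`, the `agent_span_context` and `llm_span_context` keyword names on the trace context manager, and the `metric_collection` keyword on both context classes. The helpers `load_span_contexts` and `run_with_span_evals` are hypothetical, with the loader made injectable so the wiring can be checked without DeepEval installed.

```python
from typing import Any, Callable, Tuple


def load_span_contexts() -> Tuple[Any, Any, Any]:
    """Lazily import the real objects (import path is an assumption)."""
    from deepeval.tracing import AgentSpanContext, LlmSpanContext, trace
    return trace, AgentSpanContext, LlmSpanContext


def run_with_span_evals(
    call: Callable[[], Any],
    agent_collection: str,
    llm_collection: str,
    loader: Callable[[], Tuple[Any, Any, Any]] = load_span_contexts,
) -> Any:
    """Run `call` with one metric collection per span type.

    Keyword names passed to `trace` and to the span context classes are
    assumptions inferred from the parameter descriptions above.
    """
    trace, agent_ctx_cls, llm_ctx_cls = loader()
    with trace(
        agent_span_context=agent_ctx_cls(metric_collection=agent_collection),
        llm_span_context=llm_ctx_cls(metric_collection=llm_collection),
    ):
        return call()
```

Keeping the two collections separate lets Agent spans and LLM spans be scored against different metrics within the same trace.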
View on Confident AI
To view the evals on Confident AI, follow the link printed in the console output.