LlamaIndex

Use Confident AI for LLM observability and evals for LlamaIndex

Overview

LlamaIndex is an LLM framework that makes it easy to build knowledge agents from complex data. Confident AI allows you to trace and evaluate LlamaIndex agents in just a few lines of code.

Tracing Quickstart

1. Install Dependencies

Run the following command to install the required packages:

$pip install -U deepeval llama-index
2. Set up Confident AI Key

Log in to Confident AI using your Confident API key:

$deepeval login
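
If an interactive login isn't possible (for example, in CI), DeepEval can also pick up your key from the CONFIDENT_API_KEY environment variable instead:

$export CONFIDENT_API_KEY="your-confident-api-key"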
3. Configure LlamaIndex

Instrument LlamaIndex using instrument_llama_index to enable Confident AI’s LlamaIndexHandler:

main.py
import asyncio
from llama_index.llms.openai import OpenAI
from llama_index.core.agent import FunctionAgent
import llama_index.core.instrumentation as instrument

from deepeval.integrations.llama_index import instrument_llama_index

# Route LlamaIndex's instrumentation events to Confident AI's handler
instrument_llama_index(instrument.get_dispatcher())

def multiply(a: float, b: float) -> float:
    """Useful for multiplying two numbers."""
    return a * b

agent = FunctionAgent(
    tools=[multiply],
    llm=OpenAI(model="gpt-4o-mini"),
    system_prompt="You are a helpful assistant that can perform calculations.",
)

async def llm_app(input: str):
    return await agent.run(input)

asyncio.run(llm_app("What is 3 * 12?"))

From now on, whenever your LlamaIndex agent runs, DeepEval will collect the traces and publish them to Confident AI.
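
Because instrumentation is applied globally, no per-call changes are needed; every agent invocation produces its own trace. For example, reusing the llm_app function defined above:

import asyncio

# Each call below shows up as a separate trace on Confident AI.
asyncio.run(llm_app("What is 7 * 8?"))
asyncio.run(llm_app("What is 1.5 * 4?"))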

You can view the traces directly on Confident AI by clicking the link printed in the console output.

Evals Usage

Online evals

You can run online evals on your LlamaIndex agent: all incoming traces are evaluated on Confident AI’s servers. This approach is recommended if your agent is in production.

1. Create metric collection

Create a metric collection on Confident AI with the metrics you wish to use to evaluate your LlamaIndex agent.


Your metric collection should only contain metrics that don’t require retrieval_context, context, expected_output, or expected_tools for evaluation. For example, Answer Relevancy only needs the trace’s input and output, whereas Faithfulness requires retrieval_context and therefore won’t work here.

2. Run evals

Confident AI supports online evals for LlamaIndex applications. Evaluations are configured by passing metric_collection as an argument to the trace context, which applies the collection’s metrics to all spans emitted during the trace.

main.py
import asyncio
from llama_index.llms.openai import OpenAI
from llama_index.core.agent import FunctionAgent
import llama_index.core.instrumentation as instrument
from deepeval.integrations.llama_index import instrument_llama_index
from deepeval.tracing import trace

instrument_llama_index(instrument.get_dispatcher())

def multiply(a: float, b: float) -> float:
    """Useful for multiplying two numbers."""
    return a * b

agent = FunctionAgent(
    tools=[multiply],
    llm=OpenAI(model="gpt-4o-mini"),
    system_prompt="You are a helpful assistant.",
)

async def llm_app():
    # Every span emitted inside this trace is evaluated on Confident AI
    # using the metrics in "my_metric_collection"
    with trace(metric_collection="my_metric_collection"):
        await agent.run("What is 3 * 12?")

asyncio.run(llm_app())

All incoming traces will now be evaluated using metrics from your metric collection.

View on Confident AI

You can view the evaluation results on Confident AI by clicking the link printed in the console output.