LLM Tracing Quickstart
Overview
Confident AI allows anyone building with any framework, language, or LLM to set up LLM observability through tracing. Every interaction traced is evals-native, powered by DeepEval. This means that, apart from tracking latency, cost, and error rates, you can also run evals on:
- Traces
- Spans, and
- Threads
The demo below shows what this looks like on the platform:
A trace is a single execution of your LLM app, and running evals on traces is akin to the end-to-end, single-turn evaluations you run in development.
How It Works
You can set up LLM tracing either through one of our integrations or, for Python and TypeScript users, by decorating your LLM app. Either way, Confident AI will create an execution hierarchy of your LLM app and log all components that were called for each LLM invocation:
- Trace: The overall process of tracking and visualizing the execution flow of your LLM application
- Span: Individual units of work within your application (e.g., LLM calls, tool executions, retrievals)
Each observed function **creates a span**, and **many spans make up a trace**. Once you have tracing set up, you can run evaluations at both the trace and span level.
Trace Your First LLM Call
You’ll need to get your API key as shown in the setup and installation section before continuing.
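As a minimal sketch (the `CONFIDENT_API_KEY` environment variable name is an assumption here; the setup and installation section is the authoritative reference), you could supply the key before your app runs:

```python
import os

# Assumption: DeepEval picks up your Confident AI API key from this
# environment variable. You can also export it in your shell instead.
os.environ["CONFIDENT_API_KEY"] = "your-confident-api-key"
```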
Set Up Tracing
The @observe decorator logs whatever happens inside the function it decorates, and is the primary way to instrument your LLM app for tracing.
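Here is a minimal sketch of what that can look like (assuming `observe` is imported from `deepeval.tracing` and using the OpenAI Python SDK as an example model provider; your own app will differ):

```python
from openai import OpenAI
from deepeval.tracing import observe  # assumed import path

client = OpenAI()

@observe()  # decorating llm_app creates a span; each call produces a trace
def llm_app(query: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": query}],
    )
    return response.choices[0].message.content

llm_app("Why is the sky blue?")
```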
✅ You just created a trace with a span inside it. Go to the Observatory to see your traces.
If your llm_app has more than one function, simply decorate those functions with @observe too.
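For example, extending the sketch above with a hypothetical `retrieve_docs` helper, both functions would appear as spans within the same trace:

```python
from openai import OpenAI
from deepeval.tracing import observe  # assumed import path

client = OpenAI()

@observe()  # hypothetical helper; becomes a nested span under llm_app
def retrieve_docs(query: str) -> list[str]:
    return ["Rayleigh scattering affects shorter wavelengths of light more."]

@observe()  # parent span; the call to retrieve_docs nests underneath it
def llm_app(query: str) -> str:
    docs = retrieve_docs(query)
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": f"{query}\n\nContext: {docs[0]}"}],
    )
    return response.choices[0].message.content
```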
In a later section, you’ll learn how to create spans that are LLM specific, which allow you to log things like token cost and model name automatically.
Congratulations! 🎉 Now whenever you run your LLM app, all traces will be logged **and** evaluated on Confident AI. Go to the Observatory section on Confident AI to check it out.
If you don’t see the trace, it is almost certainly because your program exited before the traces had a chance to be posted. Try setting CONFIDENT_TRACE_FLUSH=YES if this is the case:
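For instance (shown here from Python; you can equally export the variable in your shell before running your app):

```python
import os

# Ensure traces are flushed before the program exits.
# Set this before your LLM app runs (e.g., at the top of your script).
os.environ["CONFIDENT_TRACE_FLUSH"] = "YES"
```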
In the next section, we will learn how to run evals for LLM tracing.