LLM Tracing Quickstart
Overview
This guide shows you how to instrument your LLM app using the @observe decorator for Python or the observe wrapper for TypeScript.
Prefer one-line integrations or OpenTelemetry? You can also instrument your app via integrations for OpenAI, LangChain, and more or OpenTelemetry (OTEL) for any language — no decorator changes needed.
How it works
Tracing works through instrumentation, which can either be manual or through one of Confident AI’s integrations:
- Decorate or wrap your functions with @observe (Python) or observe (TypeScript)
- Each observed function becomes a span
- The outermost observed function becomes the trace — all nested spans roll up into it (see troubleshooting if spans are creating separate traces instead of nesting)
- Traces are sent to Confident AI asynchronously with zero latency impact
- Once ingested, traces can be evaluated automatically using your configured metrics
You should also understand the terminology for tracing:
- Trace: A single end-to-end execution of your LLM app — the top-level unit of observability.
- Span: An individual component within a trace, such as an LLM call, retrieval, or tool execution.
- Thread: A group of traces representing a multi-turn conversation, linked by a shared thread ID.
Instrument Your AI App
You’ll need to get your API key as shown in the setup and installation section before continuing.
Install DeepEval
Instrumentation must be done via code, so first install DeepEval, Confident AI’s official open-source SDK:
Python
TypeScript
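For Python, the install is a single pip command (add `-U` to pick up the latest release):

```shell
# Install the DeepEval SDK (Python)
pip install -U deepeval
```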
Instrument Your App
Decorate or wrap your functions to automatically capture inputs, outputs, and execution flow. Note that each observe decorator/wrapper creates a span on the UI.
Python
TypeScript
Done ✅. You just created a trace with a span inside it. Go to the Observatory to see your traces there.
If you don’t see the trace, 99.99% of the time it’s because your program exited before the traces had a chance to be posted. Try setting CONFIDENT_TRACE_FLUSH=1 if this is the case:
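For example (the script name `main.py` is a placeholder for your own entry point):

```shell
# Flush any queued traces before the process exits
CONFIDENT_TRACE_FLUSH=1 python main.py
```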
See the troubleshooting page for more details.
In a later section, you’ll learn how to create LLM-specific spans, which let you automatically log things like token cost and model name.
Update traces & spans
Once inside an observed function, you can enrich the current trace or span with additional data using update_current_trace / update_current_span (Python) or updateCurrentTrace / updateCurrentSpan (TypeScript).
Python
TypeScript
update_current_trace / updateCurrentTrace sets data on the trace (the outermost observed function) — use it for input/output, tags, metadata, threads, and users. update_current_span / updateCurrentSpan sets data on the current span — use it for span-level input/output, metadata, and online eval test case parameters.
Both can be called multiple times from anywhere inside an observed function — values are merged, with later calls overriding earlier ones. Make sure to use the right one — see update_current_trace vs update_current_span in the troubleshooting page.
Using context manager
For Python users, if you prefer not to use the @observe decorator, DeepEval also supports the Observer context manager with the same arguments:
As you learn more about the @observe decorator later on - you can rest assured that everything will automatically apply to context managers as well. This is useful when you can’t modify a function’s definition or need to instrument a specific code block rather than an entire function.
Instrument Multi-Turn Apps
If your app handles conversations or multi-turn interactions, you can group traces into a thread by providing a thread ID. Each call to your app creates a trace, and traces with the same thread ID are grouped together as a conversation.
Python
TypeScript
The thread ID can be any string (e.g., a session ID from your app). We recommend using the raw user text as the input and the raw LLM response as the output — Confident AI uses these as the conversation turns for display and thread evaluations.
For more details on thread I/O conventions, tools called, retrieval context, and running offline evals on threads, see the full Threads page.
Next steps
Now that you’ve learnt the very basics of instrumenting your AI app, dive deeper into: