Collect Feedback

Incorporate real user feedback into your evaluation pipeline

Overview

Confident AI allows you to collect feedback from end users interacting with your LLM app. End-user feedback can be left on:

  • Traces
  • Spans, and
  • Threads

When you send a user feedback annotation, you’ll get the opportunity to incorporate it into a dataset.

User feedback can be ingested via the Evals API, or via DeepEval for those using Python or TypeScript.

How It Works

To collect feedback, you need to:

  • Set up a custom UI for users to enter their rating (thumbs up/down or a 5-star system), and optionally an expected outcome/output and an explanation
  • Collect the trace UUID, span UUID, or thread ID you’d like to leave feedback for
  • Send the feedback to Confident AI via the Evals API

Since the thread ID is something you provide (click here if unsure) during LLM tracing, it is generally easier to set up feedback collection on threads than on traces and spans.
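For example, a minimal feedback endpoint might look like the sketch below. Everything except send_annotation is an assumption here: the FastAPI app, the /feedback route, and the payload fields are illustrative only.

from fastapi import FastAPI
from pydantic import BaseModel
from deepeval.annotation import send_annotation

app = FastAPI()

class FeedbackPayload(BaseModel):
    thread_id: str  # the thread ID you set during LLM tracing
    rating: int     # the rating collected from your UI (thumbs up/down or 5-star)

@app.post("/feedback")
def collect_feedback(payload: FeedbackPayload):
    # Forward the end user's rating to Confident AI via DeepEval
    send_annotation(thread_id=payload.thread_id, rating=payload.rating)
    return {"status": "ok"}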

Collect Single-Turn Feedback

1

Get UUID from trace/span

Get the UUID of the trace or span you want to collect feedback for from the current trace or span context.

from openai import OpenAI
from deepeval.tracing import observe
from deepeval.tracing.context import current_trace_context

client = OpenAI()
TRACE_UUID = None

@observe()
def llm_app(query: str) -> str:
    global TRACE_UUID

    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": query}]
    ).choices[0].message.content

    # Capture the UUID of the current trace before returning
    current_trace = current_trace_context.get()
    TRACE_UUID = current_trace.uuid
    return response

You’ll need to find a way to save the UUIDs somewhere so you can use them later.
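One simple approach, sketched below, is to key each trace UUID by a request or message ID. The in-memory dict and request_id parameter are illustrative; in production you would more likely persist the UUID to your database alongside the response.

from openai import OpenAI
from deepeval.tracing import observe
from deepeval.tracing.context import current_trace_context

client = OpenAI()
trace_uuids: dict[str, str] = {}

@observe()
def llm_app(query: str, request_id: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": query}]
    ).choices[0].message.content

    # Remember which trace produced this response so feedback can target it later
    trace_uuids[request_id] = current_trace_context.get().uuid
    return response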

2

Send annotation for trace/span

In a separate workflow, post the feedback to the Evals API using the UUIDs you collected.

from deepeval.annotation import send_annotation

send_annotation(
    trace_uuid=TRACE_UUID,
    rating=1,
    # span_uuid=SPAN_UUID, # you can only set trace_uuid or span_uuid
)
You can send either a thumbs up/down rating or a 5-star rating.
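If you captured a span UUID instead of a trace UUID, pass it as span_uuid; you can set trace_uuid or span_uuid, but not both. SPAN_UUID below is a placeholder for the UUID you saved inside your @observe-decorated function.

from deepeval.annotation import send_annotation

send_annotation(
    span_uuid=SPAN_UUID,  # placeholder for the span UUID you saved earlier
    rating=1,
)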

Collect Multi-Turn Feedback

1

Set up a thread ID

Define a thread ID and configure your traced LLM app to associate all related traces with this thread.

main.py
from openai import OpenAI
from deepeval.tracing import observe, update_current_trace

client = OpenAI()
THREAD_ID = "YOUR-THREAD-ID"

@observe()
def llm_app(query: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": query}]
    )
    # Associate this trace with the thread so feedback can be left at the thread level
    update_current_trace(thread_id=THREAD_ID)
    return response.choices[0].message.content

Since thread IDs are user-defined, you just set one when you start the conversation and reuse it across calls.
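For example, you might generate a thread ID with uuid4 when a new conversation starts and reuse it for every turn. The two-turn conversation below is illustrative and assumes the llm_app from the snippet above.

import uuid

# Generate one thread ID when the conversation starts...
THREAD_ID = str(uuid.uuid4())

# ...and reuse it for every call made during that conversation.
llm_app("What's the weather like today?")
llm_app("And tomorrow?")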

2

Send annotation for thread

Post the thread-level feedback to the Evals API using the thread ID you defined.

from deepeval.annotation import send_annotation

send_annotation(
    thread_id=THREAD_ID,
    rating=1,
)

As with single-turn feedback, you can send either a thumbs up/down rating or a 5-star rating.