Collect Feedback

Incorporate real user feedback into your evaluation pipeline

Overview

Confident AI allows you to collect feedback from end users interacting with your LLM app. End-user feedback can be left on:

  • Traces
  • Spans, and
  • Threads

When you send a user feedback annotation, you’ll get the opportunity to incorporate it into a dataset.

User feedback can be ingested via the Evals API, or via DeepEval for those using Python or TypeScript.

How It Works

To collect feedback, you need to:

  • Set up a custom UI for users to enter their rating (thumbs up/down or a 5-star system), and optionally an expected outcome/output and an explanation
  • Collect the trace UUID, span UUID, or thread ID you’d like to leave feedback for
  • Send the feedback to Confident AI via the Evals API

Since the thread ID is something you provide (click here if unsure) during LLM tracing, it is generally easier to set up feedback collection on threads than on traces and spans.
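For example, a minimal feedback endpoint might look like the sketch below. Everything except send_annotation is an assumption here: the FastAPI app, the /feedback route, and the payload fields are illustrative only.

from fastapi import FastAPI
from pydantic import BaseModel
from deepeval.annotation import send_annotation

app = FastAPI()

class FeedbackPayload(BaseModel):
    thread_id: str  # the thread ID you set during LLM tracing
    rating: int     # the rating collected from your UI (thumbs up/down or 5-star)

@app.post("/feedback")
def collect_feedback(payload: FeedbackPayload):
    # Forward the end user's rating to Confident AI via DeepEval
    send_annotation(thread_id=payload.thread_id, rating=payload.rating)
    return {"status": "ok"}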

Collect Single-Turn Feedback

1

Get UUID from trace/span

Get the UUID of the trace or span you want to collect feedback for from the current trace or span context.

from openai import OpenAI
from deepeval.tracing import observe
from deepeval.tracing.context import current_trace_context

client = OpenAI()
TRACE_UUID = None

@observe()
def llm_app(query: str) -> str:
    global TRACE_UUID

    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": query}]
    ).choices[0].message.content

    # Capture the UUID of the current trace before returning
    current_trace = current_trace_context.get()
    TRACE_UUID = current_trace.uuid
    return response

You’ll need to find a way to save the UUIDs somewhere so you can use them later.
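One simple approach, sketched below, is to key each trace UUID by a request or message ID. The in-memory dict and request_id parameter are illustrative; in production you would more likely persist the UUID to your database alongside the response.

from openai import OpenAI
from deepeval.tracing import observe
from deepeval.tracing.context import current_trace_context

client = OpenAI()
trace_uuids: dict[str, str] = {}

@observe()
def llm_app(query: str, request_id: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": query}]
    ).choices[0].message.content

    # Remember which trace produced this response so feedback can target it later
    trace_uuids[request_id] = current_trace_context.get().uuid
    return response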

2

Send annotation for trace/span

In a separate workflow, post the feedback to the Evals API using the UUIDs you collected.

from deepeval.annotation import send_annotation

send_annotation(
    trace_uuid=TRACE_UUID,
    rating=1,
    # span_uuid=SPAN_UUID, # you can only set trace_uuid or span_uuid
)
You can send either a thumbs up/down rating or a 5-star rating.
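If you captured a span UUID instead of a trace UUID, pass it as span_uuid; you can set trace_uuid or span_uuid, but not both. SPAN_UUID below is a placeholder for the UUID you saved inside your @observe-decorated function.

from deepeval.annotation import send_annotation

send_annotation(
    span_uuid=SPAN_UUID,  # placeholder for the span UUID you saved earlier
    rating=1,
)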

Collect Multi-Turn Feedback

1

Set up a thread ID

Define a thread ID and configure your traced LLM app to associate all related traces with this thread.

main.py
from openai import OpenAI
from deepeval.tracing import observe, update_current_trace

client = OpenAI()
THREAD_ID = "YOUR-THREAD-ID"

@observe()
def llm_app(query: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": query}]
    )
    # Associate this trace with the thread so feedback can be left at the thread level
    update_current_trace(thread_id=THREAD_ID)
    return response.choices[0].message.content

Since thread IDs are user-defined, you just set one when you start the conversation and reuse it across calls.
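For example, you might generate a thread ID with uuid4 when a new conversation starts and reuse it for every turn. The two-turn conversation below is illustrative and assumes the llm_app from the snippet above.

import uuid

# Generate one thread ID when the conversation starts...
THREAD_ID = str(uuid.uuid4())

# ...and reuse it for every call made during that conversation.
llm_app("What's the weather like today?")
llm_app("And tomorrow?")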

2

Send annotation for thread

Post the thread-level feedback to the Evals API using the thread ID you defined.

from deepeval.annotation import send_annotation

send_annotation(
    thread_id=THREAD_ID,
    rating=1,
)

As with single-turn feedback, you can send either a thumbs up/down rating or a 5-star rating.