OpenAI
Overview
Confident AI lets you trace and evaluate OpenAI calls, whether standalone or used as a component within a larger application.
Tracing Quickstart
Configure OpenAI
To begin tracing your OpenAI calls as a component in your application, import the OpenAI client from DeepEval instead of from the official openai package.
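A minimal sketch of what this could look like, assuming the drop-in client is exported from deepeval.openai (the model name and message content are placeholders):

```python
# Minimal sketch: assumes the drop-in client lives at deepeval.openai
from deepeval.openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment, like the official client

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder model name
    messages=[{"role": "user", "content": "What's the weather like in SF?"}],
)
print(response.choices[0].message.content)
```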
DeepEval’s drop-in client works for Chat Completions, Responses, and their async variants, and traces the chat.completions.create, beta.chat.completions.parse, and responses.create methods.
Run OpenAI
Invoke your agent by executing the script as you normally would. You can then view the traces on Confident AI by clicking the link printed in the console output.
Advanced Usage
Logging prompts
If you are managing prompts on Confident AI and wish to log them, pass your Prompt object to the trace context.
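A sketch of what this could look like, assuming prompts are pulled with DeepEval’s Prompt class and that the trace context is a trace() context manager exported from deepeval.tracing that accepts a prompt keyword (the alias is a placeholder; verify the exact import and keyword against your DeepEval version):

```python
from deepeval.openai import OpenAI
from deepeval.prompt import Prompt
from deepeval.tracing import trace  # assumption: trace context manager lives here

client = OpenAI()

# Pull the prompt you manage on Confident AI (alias is a placeholder)
prompt = Prompt(alias="my-prompt-alias")
prompt.pull()

# Pass the Prompt object to the trace context so it is logged on this trace
with trace(prompt=prompt):  # assumed keyword argument
    client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Hi!"}],
    )
```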
Logging threads
Threads are used to group related traces together, and are useful for chat apps, agents, or any multi-turn interactions. Learn more about threads here. You can set the thread_id in the trace context.
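For example, a sketch under the same trace() context manager assumption, with a placeholder thread ID:

```python
from deepeval.openai import OpenAI
from deepeval.tracing import trace  # assumption: trace context manager lives here

client = OpenAI()

# Group every turn of a conversation under one thread (ID is a placeholder)
with trace(thread_id="conversation-123"):
    client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "What did I just ask you?"}],
    )
```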
Other trace attributes
Confident AI’s advanced LLM tracing features let you set certain attributes on each trace when invoking your OpenAI client.
For example, user_id can be used to enable user analytics (learn more about user IDs here), and metadata can be used to attach any metadata to the trace.
You can set these attributes in the trace context when invoking your OpenAI client.
Attributes set in the trace context override any trace attributes set using the update_current_trace method.
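A sketch of setting these attributes, again assuming a trace() context manager from deepeval.tracing (all attribute values are placeholders):

```python
from deepeval.openai import OpenAI
from deepeval.tracing import trace  # assumption: trace context manager lives here

client = OpenAI()

# All attribute values below are placeholders
with trace(
    name="openai-agent",
    tags=["production", "chatbot"],
    metadata={"plan": "pro"},
    thread_id="conversation-123",
    user_id="user-456",
):
    client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Hi!"}],
    )
```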
View Trace Attributes
name: The name of the trace. Learn more.
tags: Tags are string labels that help you group related traces. Learn more.
metadata: Attach any metadata to the trace. Learn more.
thread_id: Supply the thread or conversation ID to view and evaluate conversations. Learn more.
user_id: Supply the user ID to enable user analytics. Learn more.
Each attribute is optional and works the same way as in Confident AI’s native tracing features.
Evals Usage
Online evals
If your OpenAI application is in production and you still want to run evaluations on your traces, use online evals, which run evaluations on all incoming traces on Confident AI’s servers.
Create metric collection
Create a metric collection on Confident AI with the metrics you wish to use to evaluate your OpenAI agent. Copy the name of the metric collection.
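One way this could look, assuming the trace context accepts a metric_collection argument that takes the collection name you copied (the argument name and collection name below are assumptions; check the online evals docs for the exact parameter):

```python
from deepeval.openai import OpenAI
from deepeval.tracing import trace  # assumption: trace context manager lives here

client = OpenAI()

# "my-metric-collection" is a placeholder; metric_collection is an assumed keyword
with trace(metric_collection="my-metric-collection"):
    client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Hi!"}],
    )
```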
End-to-end evals
Confident AI also lets you run end-to-end evals that evaluate your OpenAI calls directly. This is recommended if you are testing your OpenAI calls in isolation.
Create metric
You can only run end-to-end evals on OpenAI using metrics that evaluate input, output, or tools_called. You can pass parameters like expected_output, expected_tools, context, and retrieval_context to the trace context.
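For example, DeepEval’s AnswerRelevancyMetric only needs the input and output of a trace, so it qualifies (the threshold is a placeholder):

```python
from deepeval.metrics import AnswerRelevancyMetric

# Measures how relevant the output is to the input
metric = AnswerRelevancyMetric(threshold=0.7)
```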
Run evals
Replace your OpenAI client with DeepEval’s. Then, use the dataset’s evals_iterator to invoke your OpenAI client for each golden.
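A sketch of the loop, assuming Golden and EvaluationDataset are imported from deepeval.dataset and that metrics are attached via a metrics parameter on the completion call (that parameter name is an assumption; check the integration docs for how metrics are supplied):

```python
from deepeval.dataset import EvaluationDataset, Golden
from deepeval.metrics import AnswerRelevancyMetric
from deepeval.openai import OpenAI

dataset = EvaluationDataset(goldens=[Golden(input="What is DeepEval?")])
client = OpenAI()

# Invoke the OpenAI client once per golden in the dataset
for golden in dataset.evals_iterator():
    client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": golden.input}],
        # assumed parameter for attaching the metric created earlier
        metrics=[AnswerRelevancyMetric(threshold=0.7)],
    )
```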
The same pattern applies to the Responses API and to the async clients.
This will automatically generate a test run with evaluated OpenAI traces using inputs from your dataset.
Using OpenAI in component-level evals
You can also evaluate OpenAI calls through component-level evals. This approach is recommended if you are testing your OpenAI calls as a component in a larger application system.
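A sketch of what this could look like, wrapping DeepEval’s OpenAI client inside a component traced with the @observe decorator (the component name, model, and the metrics parameter on the completion call are placeholders or assumptions):

```python
from deepeval.dataset import EvaluationDataset, Golden
from deepeval.metrics import AnswerRelevancyMetric
from deepeval.openai import OpenAI
from deepeval.tracing import observe

client = OpenAI()

@observe()  # traces this component; the OpenAI call below becomes a child span
def my_agent(query: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": query}],
        # assumed parameter for attaching metrics, as in the end-to-end example
        metrics=[AnswerRelevancyMetric(threshold=0.7)],
    )
    return response.choices[0].message.content

dataset = EvaluationDataset(goldens=[Golden(input="What is DeepEval?")])
for golden in dataset.evals_iterator():
    my_agent(golden.input)
```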