CrewAI
Overview
CrewAI is a lean, lightning-fast Python framework for creating autonomous AI agents tailored to any scenario. Confident AI allows you to trace and evaluate CrewAI workflows with just a single line of code.
Tracing Quickstart
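A minimal sketch of what instrumentation looks like, assuming deepeval and crewai are installed and your Confident AI API key is configured (e.g. via the CONFIDENT_API_KEY environment variable); the agent, task, and input below are illustrative:

```python
from crewai import Agent, Task, Crew
from deepeval.integrations.crewai import instrument_crewai

# The single line that instruments CrewAI for tracing on Confident AI
# (assumes your Confident AI API key is configured in the environment)
instrument_crewai()

# A minimal, illustrative crew
agent = Agent(
    role="Consultant",
    goal="Write clear, concise explanations.",
    backstory="An expert consultant with a keen eye for detail.",
)
task = Task(
    description="Explain {topic} in a short paragraph.",
    expected_output="A clear and concise explanation.",
    agent=agent,
)
crew = Crew(agents=[agent], tasks=[task])

result = crew.kickoff(inputs={"topic": "large language models"})
print(result)
```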
Once instrumented, you can view the traces on Confident AI directly by clicking the link printed in the console output.
Advanced Usage
Logging threads
Threads group related traces together and are useful for chat apps, agents, or any multi-turn interaction. Set the thread_id in the trace context and call crew.kickoff within the context.
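A sketch of what this looks like, assuming your DeepEval version exposes a trace context manager from deepeval.tracing (the thread ID value is illustrative):

```python
from deepeval.tracing import trace  # assumed import path; check your DeepEval version

# All kickoffs inside this context are grouped under one thread
with trace(thread_id="chat-session-123"):
    crew.kickoff(inputs={"topic": "large language models"})
```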
Logging metadata
You can also set the metadata in the trace context.
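For example, under the same trace-context assumption as above (the metadata keys and values are illustrative):

```python
# Attach arbitrary key-value metadata to the trace
with trace(metadata={"environment": "staging", "app_version": "1.2.0"}):
    crew.kickoff(inputs={"topic": "large language models"})
```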
Other trace attributes
Additionally, you can set the name, tags and user_id in the trace context.
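For example, under the same assumptions (all values are illustrative):

```python
with trace(
    name="research-crew-run",       # illustrative trace name
    tags=["crewai", "production"],  # illustrative tags
    user_id="user-456",             # illustrative user ID
):
    crew.kickoff(inputs={"topic": "large language models"})
```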
View Trace Attributes
- name: The name of the trace.
- tags: String labels that help you group related traces.
- metadata: Arbitrary metadata to attach to the trace.
- thread_id: The thread or conversation ID, used to view and evaluate conversations.
- user_id: The user ID, used to enable user analytics.
Each attribute is optional and works the same way as Confident AI's native tracing features.
Evals Usage
Online evals
You can run online evals on your CrewAI application, which will run evaluations on all incoming traces on Confident AI’s servers. This is the recommended approach, especially if your application is in production.
Create metric collection
Create a metric collection on Confident AI with the metrics you wish to use to evaluate your CrewAI application.
Supported metrics for CrewAI
Confident AI supports evaluating the input-output pairs of CrewAI spans and traces, which means your metric collection must contain only metrics that require the input and output for evaluation.
If you’re looking to use other metrics, set up Confident AI’s native tracing instead.
Run evals
Run evaluations on the various components of your CrewAI application by setting the metric_collection on DeepEval’s wrappers for CrewAI.
The current CrewAI integration supports metrics whose parameters evaluate the input and actual output, in addition to the Task Completion metric. You can attach a metric collection at any of the following levels; a sketch follows the list.
- Trace
- Crew Span
- Agent Span
- LLM Span
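For instance, at the agent level, a sketch assuming the integration exposes a drop-in Agent wrapper that accepts a metric_collection argument (the collection name is illustrative; the other span levels would be configured analogously):

```python
from deepeval.integrations.crewai import Agent  # assumed wrapper export

# Drop-in replacement for crewai.Agent; spans produced by this agent are
# evaluated against the named metric collection on Confident AI
agent = Agent(
    role="Consultant",
    goal="Write clear, concise explanations.",
    backstory="An expert consultant with a keen eye for detail.",
    metric_collection="My Metric Collection",  # name of a collection created on Confident AI
)
```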
To run evals at the trace level, set the trace_metric_collection in DeepEval’s trace context.
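Under the same trace-context assumption as in the threading example above:

```python
from deepeval.tracing import trace  # assumed import path

# Evaluate the whole trace against the named metric collection
with trace(trace_metric_collection="My Metric Collection"):
    crew.kickoff(inputs={"topic": "large language models"})
```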
All incoming traces and spans will now be evaluated using metrics from your metric collection.