Strands Agents

Use Confident AI for LLM observability and evals for Strands Agents

Overview

Strands Agents is an open-source SDK from AWS for building and running AI agents. It emits OpenTelemetry spans natively using the OTel GenAI semantic conventions, making it straightforward to integrate with Confident AI for real-time tracing and evaluation.

The integration works via OpenTelemetry: instrument_strands() registers a StrandsSpanInterceptor and a ContextAwareSpanProcessor on the global TracerProvider. Strands’ built-in tracer picks up the provider automatically, so you only need to call instrument_strands() once before creating your Agent, and all spans flow to Confident AI in real time.

Tracing Quickstart

For users in the EU region, please set the OTEL endpoint to the EU version as shown below:

$ export CONFIDENT_OTEL_URL="https://eu.otel.confident-ai.com"
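
Where exporting a shell variable is inconvenient (a notebook, for example), the same setting can be applied from Python before instrumentation starts. This is a minimal sketch, assuming the integration reads CONFIDENT_OTEL_URL from the process environment when instrument_strands() is called; the helper name use_eu_endpoint is hypothetical and not part of deepeval.

```python
import os

EU_OTEL_URL = "https://eu.otel.confident-ai.com"


def use_eu_endpoint(env=os.environ):
    """Set the EU OTel endpoint unless a URL is already configured.

    Hypothetical helper: it only writes the environment variable named in
    this guide and returns whichever value ends up in effect.
    """
    env.setdefault("CONFIDENT_OTEL_URL", EU_OTEL_URL)
    return env["CONFIDENT_OTEL_URL"]


# Call this before instrument_strands() so the value is in place at startup.
use_eu_endpoint()
```

Using setdefault keeps any endpoint that was already exported in the shell, so the snippet does not silently override a deliberate setting.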
1. Install Dependencies

Run the following command to install the required packages:

$ pip install -U deepeval strands-agents opentelemetry-sdk opentelemetry-exporter-otlp-proto-http
2. Instrument Strands

Call instrument_strands once at startup, before creating your Agent. It registers deepeval’s processors on the global TracerProvider so Strands’ built-in tracer picks them up automatically.

main.py
from strands import Agent
from deepeval.integrations.strands import instrument_strands

# Register deepeval's processors once, before any Agent is created.
instrument_strands()

agent = Agent(model="us.amazon.nova-lite-v1:0")
result = agent("Explain OpenTelemetry in one sentence.")
print(result.message)

Strands emits OTel GenAI semantic convention attributes natively (gen_ai.user.message, gen_ai.choice, gen_ai.usage.input_tokens, gen_ai.operation.name, etc.), so the integration captures agent, LLM, and tool spans automatically with no extra configuration.
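
To make the attribute-driven capture concrete, the sketch below groups spans by their gen_ai.operation.name value. It is an illustration only: the operation names come from the OTel GenAI semantic conventions, but the mapping table and the classify_span helper are hypothetical and do not reflect deepeval's actual internals.

```python
# Hypothetical mapping from gen_ai.operation.name to the span types
# described in this guide (agent, LLM, tool).
OPERATION_TO_SPAN_TYPE = {
    "invoke_agent": "agent",
    "chat": "llm",
    "execute_tool": "tool",
}


def classify_span(attributes: dict) -> str:
    """Return the span type for a span's attribute dict, or "unknown"."""
    operation = attributes.get("gen_ai.operation.name")
    return OPERATION_TO_SPAN_TYPE.get(operation, "unknown")


example_attributes = {
    "gen_ai.operation.name": "execute_tool",
    "gen_ai.usage.input_tokens": 0,
}
span_type = classify_span(example_attributes)
```

Spans without a recognised operation name fall through to "unknown" rather than raising, which keeps the sketch tolerant of attributes it does not know about.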

3. Run your agent

Execute the script to send traces to Confident AI:

$ python main.py

You can directly view the traces on Confident AI by clicking on the link in the output printed in the console.

What Gets Captured

The Strands integration automatically extracts the following data from each span:

Span Type | Data Captured
Agent | Agent name, input message, output message, tool calls made
LLM | Model name, provider, input/output messages, token counts (input + output)
Tool | Tool name, input parameters, output

Token counts are read from gen_ai.usage.input_tokens / gen_ai.usage.output_tokens (and the prompt_tokens / completion_tokens aliases). The LLM provider is inferred from the model name, or read directly from gen_ai.response.provider when available.
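
The alias fallback described above can be sketched as follows. This is an assumption-laden illustration, not the integration's code: the helper read_token_counts is hypothetical, and the alias keys are assumed to live under the same gen_ai.usage. prefix as the primary keys.

```python
from typing import Optional


def read_token_counts(attributes: dict) -> dict:
    """Read token counts, preferring the primary keys over the aliases."""

    def first_present(*keys: str) -> Optional[int]:
        # Return the first attribute that is set, in priority order.
        for key in keys:
            value = attributes.get(key)
            if value is not None:
                return int(value)
        return None

    return {
        "input_tokens": first_present(
            "gen_ai.usage.input_tokens",
            "gen_ai.usage.prompt_tokens",  # assumed full name of the alias
        ),
        "output_tokens": first_present(
            "gen_ai.usage.output_tokens",
            "gen_ai.usage.completion_tokens",  # assumed full name of the alias
        ),
    }


counts = read_token_counts(
    {"gen_ai.usage.prompt_tokens": 12, "gen_ai.usage.output_tokens": 7}
)
```

Here the input count is found through the alias and the output count through the primary key; a span that reports neither yields None for that field.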

Advanced Usage

Logging threads

Threads group related traces together and are useful for chat apps, agents, or any multi-turn interactions. Pass thread_id to instrument_strands to assign every trace to a thread; user_id can be passed alongside it.

main.py
from strands import Agent
from deepeval.integrations.strands import instrument_strands

# Every trace produced after this call is grouped under the given thread.
instrument_strands(
    thread_id="conversation-abc123",
    user_id="user_1",
)

agent = Agent(model="us.amazon.nova-lite-v1:0")
result = agent("What's the capital of France?")
print(result.message)

Strands supports passing custom attributes via trace_attributes={"session.id": "..."} when creating an agent. The integration automatically promotes session.id to thread_id when no explicit thread_id is provided.
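
The precedence between an explicit thread_id and a session.id attribute can be pictured with a small sketch. The function resolve_thread_id is hypothetical and only mirrors the rule stated above; it is not the integration's implementation.

```python
from typing import Optional


def resolve_thread_id(
    explicit_thread_id: Optional[str], trace_attributes: dict
) -> Optional[str]:
    """Prefer an explicit thread_id; otherwise fall back to session.id."""
    if explicit_thread_id is not None:
        return explicit_thread_id
    return trace_attributes.get("session.id")


attributes = {"session.id": "session-42"}

# No explicit thread_id: session.id is promoted.
promoted = resolve_thread_id(None, attributes)

# Explicit thread_id: it wins over session.id.
explicit = resolve_thread_id("conversation-abc123", attributes)
```

If neither value is present the sketch returns None, meaning the trace is not assigned to any thread.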

Trace attributes

All trace-level attributes are optional and apply to every trace produced while the instrumentation is active.

main.py
from strands import Agent
from deepeval.integrations.strands import instrument_strands

# Trace-level attributes apply to every trace produced after this call.
instrument_strands(
    name="My Strands Agent",
    tags=["production", "v2"],
    metadata={"region": "us-east-1"},
    user_id="user_1",
    thread_id="conversation-abc123",
    environment="production",
)

agent = Agent(model="us.amazon.nova-lite-v1:0")
result = agent("Summarize the latest AI news.")
print(result.message)

api_key (str): Your Confident AI API key. Defaults to the CONFIDENT_API_KEY environment variable when omitted.

name (str): The name of the trace.

tags (List[str]): Tags are string labels that help you group related traces.

metadata (Dict): Attach any metadata to the trace.

thread_id (str): Supply the thread or conversation ID to view and evaluate conversations.

user_id (str): Supply the user ID to enable user analytics.

turn_id (str): The turn ID for multi-turn conversations.

test_case_id (str): Associate this trace with a specific test case ID.

metric_collection (str): The name of the metric collection to use for online evals at the trace level.

environment (str): The deployment environment. Accepted values: "production", "staging", "development", "testing". Defaults to "development".

Each attribute is optional, and works the same way as the native tracing features on Confident AI.
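
Because environment accepts only four values, it can be convenient to validate it in your own configuration code before passing it on. The sketch below is one way to do that; resolve_environment is a hypothetical helper, and raising ValueError on an unknown value is a choice made here, since this guide does not state how the integration itself treats invalid values.

```python
from typing import Optional

# Accepted values and default, as listed in the parameter reference above.
ACCEPTED_ENVIRONMENTS = ("production", "staging", "development", "testing")
DEFAULT_ENVIRONMENT = "development"


def resolve_environment(environment: Optional[str] = None) -> str:
    """Return a valid environment name, applying the documented default."""
    value = DEFAULT_ENVIRONMENT if environment is None else environment
    if value not in ACCEPTED_ENVIRONMENTS:
        raise ValueError(
            f"environment must be one of {ACCEPTED_ENVIRONMENTS}, got {value!r}"
        )
    return value


default_env = resolve_environment()
production_env = resolve_environment("production")
```

The validated value can then be passed as environment=... to instrument_strands.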

Logging prompts

If you are managing prompts on Confident AI and wish to log them, use next_llm_span to associate a Prompt with the next LLM span before invoking your agent.

main.py
from strands import Agent
from deepeval.prompt import Prompt
from deepeval.tracing import next_llm_span
from deepeval.integrations.strands import instrument_strands

instrument_strands(environment="production")

agent = Agent(model="us.amazon.nova-lite-v1:0")

# Pull the prompt first so it can be attached to the LLM span.
prompt = Prompt(alias="<prompt-alias>")
prompt.pull(version="00.00.01")

# The prompt is associated with the next LLM span created inside this block.
with next_llm_span(prompt=prompt):
    result = agent(prompt.interpolate())
print(result.message)

Be sure to pull the prompt before logging it, otherwise the prompt will not be visible on Confident AI.

Using with tools

The integration captures tool spans automatically. Define tools with the @tool decorator and pass them to your Agent as usual.

main.py
from strands import Agent
from strands.tools import tool
from deepeval.integrations.strands import instrument_strands

instrument_strands(environment="production")

@tool
def get_weather(city: str) -> str:
    """Get the current weather for a city."""
    return f"The weather in {city} is sunny and 22°C."

agent = Agent(
    model="us.amazon.nova-lite-v1:0",
    tools=[get_weather],
)

result = agent("What's the weather like in Paris?")
print(result.message)

Each tool invocation produces a tool span with the tool name, input parameters, and output captured automatically.

Using with observe

When instrument_strands is called inside an active deepeval @observe / with trace(...) context, Strands’ OTel spans are stitched into the enclosing deepeval trace. update_current_span(...) and update_current_trace(...) work anywhere in the call stack.

main.py
from strands import Agent
from deepeval import observe
from deepeval.integrations.strands import instrument_strands

instrument_strands()

agent = Agent(model="us.amazon.nova-lite-v1:0")

@observe(name="my-app")
def run_pipeline(prompt: str) -> str:
    # Strands spans are stitched into this @observe trace automatically
    result = agent(prompt)
    return result.message

Evals Usage

Online evals

You can run online evals on your Strands agent, which will run evaluations on all incoming traces on Confident AI’s servers. This approach is recommended if your agent is in production.

1. Create metric collection

Create a metric collection on Confident AI with the metrics you wish to use to evaluate your Strands agent.

Your metric collection must contain only metrics that evaluate the input and actual output of the component it is assigned to.

2. Run evals

Pass the metric_collection parameter to instrument_strands to run online evals at the trace level. For span-level evals, use update_current_span(metric_collection=...) inside your agent code.

main.py
from strands import Agent
from deepeval.integrations.strands import instrument_strands

# metric_collection enables online evals at the trace level.
instrument_strands(
    metric_collection="my-trace-collection",
    environment="production",
)

agent = Agent(model="us.amazon.nova-lite-v1:0")
result = agent("Summarize the latest AI news.")
print(result.message)

We recommend creating separate metric collections for each component (trace, agent span, LLM span, tool span), since each requires its own evaluation criteria and metrics.

All incoming traces will now be evaluated using metrics from your metric collection.

You can view evals on Confident AI by clicking on the link in the output printed in the console.