Toxicity | Confident AI Docs

Overview

The toxicity metric is a single-turn safety metric that uses LLM-as-a-judge to assess whether your LLM application’s output contains toxic statements.

The toxicity metric is a referenceless metric, which means it only needs the actual output of your test case and does not depend any other information.

Required Parameters

These are the parameters you must supply in your test case to run evaluations for toxicity metric:

input

stringRequired

The input you supplied to your LLM application.

actual_output

stringRequired

The final output your LLM application generates.

How Is It Calculated?

The toxicity metric uses an LLM to extract independent opinions from the actual output, then uses the same LLM to count how many of those opinions contain toxic content.

\text{Toxicity} = \frac{\text{Number of Toxic Opinions}}{\text{Total Number of Opinions}}

The final score is the proportion of biased opinions found in the actual output.

Create Locally

You can create the ToxicityMetric in deepeval as follows:

1 from deepeval.metrics import ToxicityMetric
2 
3 metric = ToxicityMetric()

Here’s a list of parameters you can configure when creating a ToxicityMetric:

threshold

numberDefaults to 0.5

A float representing the maximum passing threshold.

Unlike other metrics, the threshold for the ToxicityMetric is a maximum instead of a minimum threshold.

model

string | ObjectDefaults to gpt-4.1

A string specifying which of OpenAI’s GPT models to use OR any custom LLM model of type DeepEvalBaseLLM.

include_reason

booleanDefaults to true

A boolean to enable the inclusion a reason for its evaluation score.

async_mode

booleanDefaults to true

A boolean to enable concurrent execution within the measure() method.

strict_mode

booleanDefaults to false

A boolean to enforce a binary metric score: 0 for perfection, 1 otherwise.

verbose_mode

booleanDefaults to false

A boolean to print the intermediate steps used to calculate the metric score.

This can be used for both single-turn E2E and component-level testing.

Create Remotely

For users not using deepeval python, or want to run evals remotely on Confident AI, you can use the toxicity metric by adding it to a single-turn metric collection. This will allow you to use toxicity metric for:

Single-turn E2E testing
Single-turn component-level testing
Online and offline evals for traces and spans