Introduction to Red Teaming

Proactively identify vulnerabilities and safety issues in your AI applications before they reach production.

Overview

Red Teaming on Confident AI is an adversarial testing platform for AI safety and security. You can run it in two ways:

  • No-code, directly in the platform UI (best for security teams, PMs, and compliance officers), or
  • Code-driven, using the deepteam framework (best for engineers and AI red teamers).

The no-code workflow is the more comprehensive option: it leverages the platform’s framework builder to produce CVSS scores and full risk profiles. The code-driven workflow uses deepteam to orchestrate red teaming runs programmatically.
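For reference, a minimal code-driven run looks roughly like the sketch below. It follows the shape of deepteam's documented quickstart; the model_callback body is a placeholder you would replace with a call into your own application, and exact class and parameter names may vary between deepteam versions:

    from deepteam import red_team
    from deepteam.vulnerabilities import Bias
    from deepteam.attacks.single_turn import PromptInjection

    # Placeholder target: swap in a call to your own LLM application.
    async def model_callback(input: str) -> str:
        return "placeholder response"

    # Choose what to probe for (vulnerabilities) and how to probe
    # (attacks), then run the assessment.
    risk_assessment = red_team(
        model_callback=model_callback,
        vulnerabilities=[Bias(types=["race"])],
        attacks=[PromptInjection()],
    )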

Red Teaming integrates with your existing LLM evaluation and tracing workflows on Confident AI.

Key capabilities

Everything you need to operationalize AI red teaming:

Custom Frameworks

Ships with OWASP Top 10 for LLMs, NIST AI RMF, and more out of the box, or build your own with the framework builder.

CVSS Scoring

Get standardized severity scores, based on the Common Vulnerability Scoring System (CVSS), for every risk assessment you run.

Compliance Reporting

Generate reports aligned to regulatory frameworks like EU AI Act, NIST, and OWASP Top 10 for LLMs.

CI/CD Integration

Integrate security testing into your deployment pipelines for continuous assessment.
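As one way to wire this up, the sketch below gates a pipeline stage on a deepteam run by exiting non-zero when any check fails. This is an illustration, not a prescribed integration; the test_cases and status attributes used to inspect results are assumptions here, so check your deepteam version for the actual shape of the returned risk assessment:

    import sys

    from deepteam import red_team
    from deepteam.vulnerabilities import PIILeakage
    from deepteam.attacks.single_turn import PromptInjection

    async def model_callback(input: str) -> str:
        # Placeholder: route the input through your deployed application.
        return "placeholder response"

    risk_assessment = red_team(
        model_callback=model_callback,
        vulnerabilities=[PIILeakage()],
        attacks=[PromptInjection()],
    )

    # Assumed accessors (test_cases, status) -- verify against your
    # deepteam version before relying on this in a real pipeline.
    failures = [tc for tc in risk_assessment.test_cases if tc.status == "fail"]
    if failures:
        print(f"{len(failures)} red teaming check(s) failed")
        sys.exit(1)  # non-zero exit fails the pipeline stage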

Choose your workflow

Run risk assessments in the UI without writing any code, or drive them programmatically with deepteam.

Not sure which to pick?

Start with no-code: it gives you the most comprehensive assessment out of the box, including CVSS scores and the framework builder. Use code-driven when you need to orchestrate red teaming in CI/CD or develop custom attacks. Results from both workflows appear in the same dashboards.

What you can red team

Test your AI systems across all major vulnerability categories; a code-driven configuration sketch follows the categories below:

Prompt Injection & Jailbreaks

Probe for prompt manipulation, model exploitation, and adversarial bypass techniques across 50+ attack patterns.

Bias & Fairness

Assess outputs for bias across protected characteristics and demographic groups.

Content Safety

Evaluate for harmful, toxic, or inappropriate outputs including PII leakage and misinformation.
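In the code-driven workflow, these categories map onto deepteam's vulnerability and attack classes. The sketch below shows roughly how all three might be covered in a single run; the specific class names and type strings are drawn from deepteam's documentation and may differ between versions:

    from deepteam import red_team
    from deepteam.vulnerabilities import Bias, Misinformation, PIILeakage, Toxicity
    from deepteam.attacks.multi_turn import LinearJailbreaking
    from deepteam.attacks.single_turn import PromptInjection

    async def model_callback(input: str) -> str:
        # Placeholder: call your own LLM application here.
        return "placeholder response"

    risk_assessment = red_team(
        model_callback=model_callback,
        vulnerabilities=[
            Bias(types=["race", "gender"]),  # bias & fairness
            Toxicity(),                      # content safety
            PIILeakage(),                    # PII leakage
            Misinformation(),                # misinformation
        ],
        # Single-turn prompt injection plus a multi-turn jailbreak.
        attacks=[PromptInjection(), LinearJailbreaking()],
    )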

Learn the fundamentals

New to AI red teaming? These concepts will help you get the most out of your setup:

How does AI red teaming differ from traditional security testing?

AI red teaming targets vulnerabilities specific to AI systems, such as prompt injection, model poisoning, and adversarial examples, rather than infrastructure or application-level security. See our frameworks guide for more detail.

What types of AI systems can I red team?

All types: conversational AI, RAG systems, multi-agent workflows, fine-tuned models, and AI-powered APIs. Each system type has tailored attack scenarios and evaluation criteria.

Do I need security expertise?

No. The platform provides automated risk assessments, pre-built attack scenarios, and step-by-step remediation guidance. Security expertise helps with advanced features, but isn’t required to get started.