Skip to main content

COMPEL Glossary / red-teaming

Red Teaming

Red teaming is a security and safety testing practice where a dedicated team deliberately attempts to find vulnerabilities, trigger unsafe behavior, or exploit weaknesses in an AI system.

What this means in practice

Red teams design adversarial scenarios, craft inputs to elicit harmful outputs, test boundary compliance, and evaluate whether safety mechanisms can be circumvented. For agentic AI systems, red teaming extends to testing tool misuse, unauthorized escalation, and boundary violations. Red teaming is not a one-time activity -- it should be repeated as AI capabilities evolve, tool access changes, and new attack vectors emerge. In the COMPEL framework, red teaming is recommended as part of the Evaluate stage for all high-risk AI systems and is mandatory for agents at Level 3 autonomy and above.

Why it matters

Red teaming exposes AI vulnerabilities that standard testing cannot detect, including adversarial manipulations, safety boundary circumventions, and unexpected failure modes. For agentic AI systems, red teaming is even more critical because agents can take autonomous actions with real-world consequences. Organizations that skip red teaming deploy systems with unknown vulnerabilities that adversaries will eventually discover and exploit.

How COMPEL uses it

COMPEL recommends red teaming as part of the Evaluate stage for all high-risk AI systems and mandates it for agents at Level 3 autonomy and above. During the Model stage, red team scope and methodology are defined as part of the testing strategy. The Agent Governance cross-cutting layer specifies red teaming requirements that escalate with autonomy level, and red team findings feed directly into risk register updates.

Related Terms

Other glossary terms mentioned in this entry's definition and context.