Data Classification

Data classification is the process of categorizing data based on its sensitivity level, regulatory requirements, and business criticality into tiers such as public, internal, confidential, and restricted.

What this means in practice

Each classification tier carries specific requirements for handling, storage, access control, encryption, retention, and disposal. For organizations training AI models, data classification determines which datasets can be used for model training, what protections must be in place, and how the resulting models can be deployed and shared. In COMPEL, data classification is a foundational governance control: it is assessed during the Calibrate stage and built into the data governance framework designed during the Model stage, so that AI projects receive data with appropriate protections and classification decisions are documented for audit purposes.
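
As an illustration of how a tier can gate training use, here is a minimal Python sketch. It is not part of COMPEL: the four tier names come from the definition above, but the `Dataset` fields, the `training_decision` function, and the thresholds (`MAX_TRAINING_TIER`, `PROTECTED_FROM`) are hypothetical choices that an organization would replace with its own policy.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional, Tuple


class Tier(IntEnum):
    """The four tiers named in the definition, ordered by sensitivity."""
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    RESTRICTED = 3


@dataclass(frozen=True)
class Dataset:
    """Hypothetical dataset record; the field names are illustrative."""
    name: str
    tier: Tier
    encrypted_at_rest: bool = False
    approved_by: Optional[str] = None  # who documented the classification decision


# Assumed policy thresholds, not COMPEL requirements.
MAX_TRAINING_TIER = Tier.CONFIDENTIAL  # anything above this is never used for training
PROTECTED_FROM = Tier.CONFIDENTIAL     # from this tier up, extra protections apply


def training_decision(ds: Dataset) -> Tuple[bool, str]:
    """Return (allowed, reason) for using a dataset in model training."""
    if ds.tier > MAX_TRAINING_TIER:
        return False, f"{ds.tier.name} data is not permitted for training"
    if ds.tier >= PROTECTED_FROM:
        if not ds.encrypted_at_rest:
            return False, "encryption at rest is required"
        if ds.approved_by is None:
            return False, "no documented approval"
    return True, "allowed"


if __name__ == "__main__":
    for ds in (
        Dataset("product-docs", Tier.PUBLIC),
        Dataset("crm-export", Tier.CONFIDENTIAL, encrypted_at_rest=True),
        Dataset("hr-records", Tier.RESTRICTED, True, "data-protection-officer"),
    ):
        print(ds.name, training_decision(ds))
```

Returning a reason alongside the decision keeps each outcome explainable, which fits the requirement that classification decisions be documented for audit purposes.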

Why it matters

Data classification is the gatekeeper that determines which datasets can be used for AI training and under what protections. Without it, organizations either risk regulatory fines for mishandling sensitive data or restrict access so heavily that AI projects are starved of the data they need. Clear classification tiers enable faster, more confident decisions about data usage while maintaining compliance with privacy regulations.

How COMPEL uses it

COMPEL assesses data classification maturity during the Calibrate stage as a foundational governance control under the Governance pillar (Domains 14-18). During the Model stage, classification schemes are designed as part of the data governance framework, so that AI projects receive data with appropriate protections. The Produce stage operationalizes classification policies, and the Evaluate stage audits compliance with classification requirements across deployed systems.
