
COMPEL Glossary / labeling

Labeling

Labeling (also called annotation) is the process of tagging data with correct answers to create training datasets for supervised learning.

What this means in practice

For a spam detection model, humans label thousands of emails as 'spam' or 'not spam.' For a medical imaging model, physicians annotate scans to identify pathological features. Because these labels become the ground truth a supervised model learns from, labeling workflows typically include quality controls such as having multiple annotators label the same items and measuring how often they agree.
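One common quality control is to score inter-annotator agreement: two annotators label the same sample of items, and their agreement beyond chance is measured with a statistic such as Cohen's kappa. The sketch below is illustrative only; the annotator lists and the spam labels are invented for the example.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Agreement between two annotators, corrected for chance (Cohen's kappa)."""
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    # Observed agreement: fraction of items both annotators labeled the same.
    po = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected chance agreement, from each annotator's label distribution.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    pe = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    return (po - pe) / (1 - pe)

# Two annotators label the same six emails.
ann_a = ["spam", "spam", "not spam", "not spam", "spam", "not spam"]
ann_b = ["spam", "spam", "not spam", "spam", "spam", "not spam"]
print(round(cohens_kappa(ann_a, ann_b), 2))  # → 0.67
```

A kappa near 1.0 indicates a clear labeling task; low kappa signals ambiguous guidelines or items that need expert adjudication, both of which drive up labeling cost.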

Why it matters

Labeling is often the most expensive and time-consuming part of an ML project because it requires human judgment, domain expertise, and quality control. Use cases where labeled data already exists in enterprise systems are significantly cheaper to pursue than those requiring manual expert labeling from scratch. Organizations that fail to account for labeling costs during planning face budget overruns and timeline delays that can derail AI projects.

How COMPEL uses it

In the COMPEL Model stage, data readiness assessments must account for labeling requirements, costs, and timelines. Use case evaluation during Model considers labeling feasibility as a factor in prioritization. During Produce, labeling workflows are established with quality controls. The Evaluate stage measures labeling efficiency and quality, and the Learn stage captures lessons about labeling approaches to improve planning accuracy in future cycles.
