Data Lineage
Data lineage is the documented, traceable history of a piece of data as it moves through an organization's systems. It records where the data originated, how it was collected, what transformations were applied, where it was stored, who accessed it, and how it was ultimately used in AI models or business processes.
What this means in practice
For AI systems, data lineage serves three purposes: debugging model issues (tracing unexpected predictions back to specific data sources), regulatory compliance (demonstrating that data was collected and used lawfully), and governance (ensuring training data meets quality and consent requirements).
Why it matters
When an AI model produces an unexpected prediction, data lineage enables teams to trace the problem back to its source, whether that is a corrupted data feed, an upstream transformation error, or a training data quality issue. Without lineage, debugging AI systems becomes guesswork. Regulators increasingly require demonstrable data provenance as evidence that AI decisions are based on lawful, properly handled information.
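The tracing described above can be pictured as a walk over a dependency graph. The following is a minimal sketch, not a COMPEL-specified implementation: the artifact names, the UPSTREAM mapping, and both function names are hypothetical and exist only for illustration.

```python
from collections import deque

# Hypothetical lineage graph: each artifact maps to the upstream
# artifacts it was directly derived from. All names are illustrative.
UPSTREAM = {
    "churn_model_v3": ["training_set_2024q1"],
    "training_set_2024q1": ["cleaned_events", "customer_profiles"],
    "cleaned_events": ["raw_event_feed"],
    "customer_profiles": ["crm_export"],
    "raw_event_feed": [],
    "crm_export": [],
}


def trace_upstream(artifact, upstream=UPSTREAM):
    """Return every artifact that `artifact` depends on, nearest first."""
    seen = []
    queue = deque(upstream.get(artifact, []))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue  # already visited via another path
        seen.append(node)
        queue.extend(upstream.get(node, []))
    return seen


def root_sources(artifact, upstream=UPSTREAM):
    """Return only the original sources (artifacts with no upstream)."""
    return [n for n in trace_upstream(artifact, upstream) if not upstream.get(n)]


if __name__ == "__main__":
    print(trace_upstream("churn_model_v3"))
    print(root_sources("churn_model_v3"))
```

Starting from the model, the breadth-first walk lists the training set first, then the intermediate transformations, then the raw feeds, which is roughly the order in which a team would inspect candidates for a corrupted feed or a transformation error.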
How COMPEL uses it
Data lineage capability is assessed during Calibrate under both the Technology pillar (infrastructure for tracking) and the Governance pillar (policies requiring tracking). During Model, lineage infrastructure is designed as part of the data architecture specified in Module 3.3. The Produce stage implements lineage tooling, and the Evaluate stage uses lineage records as audit evidence to verify governance compliance.
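As an illustration of lineage records serving as audit evidence, the sketch below flags records that are missing provenance or consent information. It rests on assumptions: the LineageRecord fields, the audit_gaps helper, and the sample data are all hypothetical, and COMPEL does not prescribe this schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class LineageRecord:
    """Hypothetical lineage record; field names are illustrative only."""
    dataset: str
    origin: Optional[str]          # where the data came from
    consent_basis: Optional[str]   # e.g. "contract", "user consent"
    transformations: List[str] = field(default_factory=list)


def audit_gaps(records: List[LineageRecord]) -> Dict[str, List[str]]:
    """Map each dataset to the required fields its record is missing."""
    gaps: Dict[str, List[str]] = {}
    for record in records:
        missing = []
        if not record.origin:
            missing.append("origin")
        if not record.consent_basis:
            missing.append("consent_basis")
        if missing:
            gaps[record.dataset] = missing
    return gaps


if __name__ == "__main__":
    records = [
        LineageRecord("customer_profiles", "crm_export", "contract", ["dedupe"]),
        LineageRecord("cleaned_events", "raw_event_feed", None, ["filter_bots"]),
        LineageRecord("legacy_dump", None, None),
    ]
    print(audit_gaps(records))
```

A real audit would check far more than two fields, but the shape is the same: each governance requirement becomes a check that a lineage record either satisfies or fails.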