COMPEL Glossary / overfitting
Overfitting
Overfitting occurs when an AI model learns the training data too closely, memorizing specific examples (including their noise and anomalies) rather than learning generalizable patterns, and consequently performs poorly on new, unseen data.
What this means in practice
An overfitted model might achieve 99% accuracy on training data but only 70% on production data, because it learned to recognize specific training examples rather than the underlying patterns they represent. Overfitting is a common risk when models are too complex relative to the available training data, when training runs too long, or when validation practices are inadequate. Detection and prevention techniques include cross-validation, holdout test sets, regularization, early stopping, and careful model complexity selection. For transformation leaders, overfitting risk underscores why model validation must use independent data that the model never saw during training.
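The train-versus-holdout accuracy gap described above can be sketched in a few lines. This is an illustrative toy, not part of COMPEL: a 1-nearest-neighbor classifier memorizes the training data exactly, including two deliberately flipped (noisy) labels, while a smoother 5-nearest-neighbor classifier does not. The memorizer scores perfectly on training data but worse on an independent holdout set labeled by the true pattern.

```python
# Illustrative sketch of overfitting detection with a holdout set.
# The data, model, and noise placement here are invented for the example.

def knn_predict(train, x, k):
    """Predict a binary label for x by majority vote among the k nearest training points."""
    neighbors = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    votes = sum(label for _, label in neighbors)
    return 1 if votes * 2 > k else 0

def accuracy(train, data, k):
    """Fraction of (x, y) pairs in data that the k-NN model predicts correctly."""
    hits = sum(knn_predict(train, x, k) == y for x, y in data)
    return hits / len(data)

# True underlying pattern: label is 1 when x >= 10.
train = [(x, int(x >= 10)) for x in range(20)]
train[3] = (3, 1)    # noisy (flipped) training label
train[12] = (12, 0)  # noisy (flipped) training label

# Independent holdout data the model never saw, labeled by the true pattern.
holdout = [(x + 0.25, int(x + 0.25 >= 10)) for x in range(20)]

for k in (1, 5):
    print(f"k={k}: train accuracy={accuracy(train, train, k):.2f}, "
          f"holdout accuracy={accuracy(train, holdout, k):.2f}")
```

Here k=1 memorizes the noise (perfect training accuracy, lower holdout accuracy), while k=5 gives up some training accuracy but generalizes better; the gap between the two scores is the signal that validation on independent data is meant to surface.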
Why it matters
Overfitting causes AI models to perform brilliantly on training data but fail in production, creating a false sense of capability that leads to poor business decisions. A model achieving 99% training accuracy but only 70% production accuracy delivers unreliable results that erode stakeholder trust. Understanding overfitting risk ensures organizations demand proper validation practices before deploying AI systems.
How COMPEL uses it
During the Model stage, COMPEL's use case evaluation includes technical feasibility assessment where overfitting risk is identified based on data volume and model complexity. The Evaluate stage requires independent validation using data the model never saw during training. The Technology pillar's MLOps maturity assessment (Domain 7) includes cross-validation and holdout testing as standard practices.
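The cross-validation practice referenced above can be sketched generically. This is a minimal k-fold outline, not COMPEL's implementation: the data is split into k non-overlapping folds, the model is trained on k-1 folds and scored on the held-out fold, and the scores are averaged. The `fit` and `score` functions here are hypothetical placeholders (a mean predictor scored by average absolute error).

```python
# Minimal k-fold cross-validation sketch; the model and scoring function
# are illustrative placeholders, not part of any specific framework.

def k_fold_indices(n, k):
    """Split indices 0..n-1 into k roughly equal, non-overlapping folds."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def cross_validate(data, k, fit, score):
    """Train on k-1 folds, score on the held-out fold, and average the scores."""
    folds = k_fold_indices(len(data), k)
    scores = []
    for held_out in folds:
        train = [data[i] for i in range(len(data)) if i not in held_out]
        test = [data[i] for i in held_out]
        model = fit(train)
        scores.append(score(model, test))
    return sum(scores) / k

# Placeholder model: predict the mean of the training targets.
fit = lambda train: sum(y for _, y in train) / len(train)
# Placeholder score: average absolute error on the held-out fold.
score = lambda model, test: sum(abs(model - y) for _, y in test) / len(test)

data = [(i, 1.0) for i in range(10)]
print(cross_validate(data, 5, fit, score))
```

Because every example serves exactly once as held-out data, the averaged score estimates performance on data the model did not see during training, which is the validation property the Evaluate stage requires.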
Related terms
Other glossary terms mentioned in this entry's definition and context.