Online evaluation

Assessment of an AI system under live traffic using a randomized or sequential experimental design, such as an A/B test, multi-armed bandit, canary release, or interleaving.

What this means in practice

Online evaluation is the only evaluation mode that measures true user-facing outcomes. Because real users are exposed to the system under test, governance constraints apply for the duration of the live test: blast radius, reversibility, and regulatory exposure.
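As an illustration of how blast radius and reversibility might be enforced in a canary-style test, here is a minimal sketch. The names `bucket`, `Canary`, `exposure`, `max_error_rate`, and `min_requests` are hypothetical and chosen for this example only; a real deployment would rely on its own routing and monitoring stack.

```python
import hashlib


def bucket(user_id: str, salt: str = "exp-1") -> float:
    """Map a user id to a stable value in [0, 1) so assignment is repeatable."""
    digest = hashlib.sha256(f"{salt}:{user_id}".encode()).hexdigest()
    return int(digest[:8], 16) / 0x100000000


class Canary:
    """Hypothetical canary controller: capped exposure plus automatic rollback."""

    def __init__(self, exposure: float, max_error_rate: float, min_requests: int = 100):
        self.exposure = exposure              # blast radius: share of users on the candidate
        self.max_error_rate = max_error_rate  # rollback threshold for the candidate
        self.min_requests = min_requests      # observations required before judging
        self.requests = 0
        self.errors = 0
        self.rolled_back = False

    def route(self, user_id: str) -> str:
        """Return which variant serves this user."""
        if self.rolled_back:
            return "control"                  # reversibility: all traffic back to control
        return "candidate" if bucket(user_id) < self.exposure else "control"

    def record(self, error: bool) -> None:
        """Record one candidate request and roll back if the error rate is too high."""
        self.requests += 1
        self.errors += int(error)
        if (self.requests >= self.min_requests
                and self.errors / self.requests > self.max_error_rate):
            self.rolled_back = True
```

Hashing the user id rather than drawing a random number per request keeps each user on the same variant across requests, which is one common way to keep the exposed population fixed and auditable.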

Synonyms

online test, live-traffic evaluation, production evaluation

See also

  • Offline evaluation: Assessment of an AI system against static datasets (training hold-out, validation set, benchmark corpus) without exposure to live user traffic.
  • Multi-armed bandit: An online experimentation strategy that shifts traffic toward better-performing variants during the experiment, trading statistical power for exploitation of early wins.
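
The traffic-shifting behaviour of a multi-armed bandit can be sketched with epsilon-greedy, one simple bandit policy among several (Thompson sampling and UCB are common alternatives). This is a simulation under stated assumptions: the function name `epsilon_greedy`, the conversion rates passed in as `true_rates`, and the defaults for `rounds`, `epsilon`, and `seed` are all hypothetical.

```python
import random


def epsilon_greedy(true_rates, rounds=10_000, epsilon=0.1, seed=0):
    """Simulate epsilon-greedy allocation over variants with known success rates.

    Returns (pulls, wins): how often each variant was served and how often it succeeded.
    """
    rng = random.Random(seed)                 # seeded for reproducibility
    n_arms = len(true_rates)
    pulls = [0] * n_arms
    wins = [0] * n_arms
    for _ in range(rounds):
        if rng.random() < epsilon or 0 in pulls:
            # Explore: pick a variant uniformly at random
            # (also done until every variant has been served at least once).
            arm = rng.randrange(n_arms)
        else:
            # Exploit: serve the variant with the best observed success rate.
            arm = max(range(n_arms), key=lambda i: wins[i] / pulls[i])
        pulls[arm] += 1
        wins[arm] += int(rng.random() < true_rates[arm])
    return pulls, wins


pulls, wins = epsilon_greedy([0.05, 0.20])
print(pulls)  # the second variant should receive most of the traffic
```

The uneven `pulls` counts show the trade-off named in the definition: the weaker variant collects far fewer observations than it would under an equal split, so the estimate of its performance is less precise.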

Related articles in the Body of Knowledge