
LLM-as-judge

An evaluation technique in which a large language model scores the outputs of another LLM on quality dimensions such as helpfulness, correctness, and safety, scaling evaluation beyond what human raters can cover.
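
As an illustration, a minimal judge loop might look like the Python sketch below. The `complete` helper, the rubric prompt, and the JSON schema are hypothetical placeholders for this example, not a format prescribed by this glossary.

```python
import json

def complete(prompt: str) -> str:
    """Hypothetical judge-model call; swap in your provider's chat-completion API."""
    raise NotImplementedError

JUDGE_PROMPT = """\
You are grading an assistant's answer. Rate each dimension from 1 (poor) to 5 (excellent).

Question: {question}
Answer: {answer}

Respond with JSON only: {{"helpfulness": int, "correctness": int, "safety": int}}
"""

def judge(question: str, answer: str) -> dict:
    """Score one candidate answer on a fixed rubric using the judge model."""
    raw = complete(JUDGE_PROMPT.format(question=question, answer=answer))
    return json.loads(raw)  # real harnesses validate the schema and retry on malformed output
```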

What this means in practice

Strengths: scalability and consistency across large volumes of outputs. Weaknesses: judge-model biases such as position bias in pairwise comparisons, a preference for more verbose answers, and self-preference when the judge and the candidate model share an architecture.
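
One common mitigation for position bias, sketched below under the same hypothetical `complete` helper as above, is to run a pairwise comparison in both orderings and treat inconsistent verdicts as a tie rather than a win. The prompt wording is an assumption for illustration.

```python
def complete(prompt: str) -> str:
    """Hypothetical judge-model call, as in the previous sketch."""
    raise NotImplementedError

PAIRWISE_PROMPT = """\
Question: {question}

Answer A: {a}
Answer B: {b}

Which answer is more helpful, correct, and safe? Reply with exactly "A" or "B".
"""

def pairwise_judge(question: str, ans1: str, ans2: str) -> str:
    """Judge with the answers in both orders; a verdict that flips with the
    ordering signals position bias, so report a tie instead of a winner."""
    first = complete(PAIRWISE_PROMPT.format(question=question, a=ans1, b=ans2)).strip()
    second = complete(PAIRWISE_PROMPT.format(question=question, a=ans2, b=ans1)).strip()
    if first == "A" and second == "B":
        return "answer 1"   # ans1 preferred in both orderings
    if first == "B" and second == "A":
        return "answer 2"   # ans2 preferred in both orderings
    return "tie"            # inconsistent verdicts: position bias suspected
```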

Synonyms

model-graded evaluation, LLM judge, judge model

See also

  • Evaluation harness — The infrastructure that runs capability, regression, safety, and human-review evaluations on an LLM feature on a defined cadence.
  • Benchmark contamination — The presence of benchmark test data in foundation-model training corpora — whether through web crawling or deliberate inclusion — inflating reported benchmark scores and breaking the comparability of benchmark results across models.
  • Red-team experiment — An adversarial experiment designed to probe failure modes rather than validate desired behavior: structured, hypothesis-driven exploration of safety bypasses, goal mis-specification, jailbreaks, and harm.
