Continuous batching
An inference-server technique — popularised by vLLM and Text Generation Inference — that dynamically groups concurrent requests at the token-generation level to raise GPU utilisation.
What this means in practice
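Unlike static batching, which holds a batch fixed until its longest sequence finishes, continuous batching re-forms the batch at every decoding iteration: completed sequences leave and queued requests take their slots immediately. Keeping the GPU full in this way is central to making self-hosted LLM inference economically viable at scale.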
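The contrast with static batching can be illustrated with a toy scheduler simulation. This is a minimal sketch, not how vLLM or Text Generation Inference implement scheduling: the function names, the fixed slot count `batch_size`, and the example output lengths are all hypothetical, and it assumes every request produces one token per iteration and ignores prefill cost and memory limits.

```python
from collections import deque


def static_batching_steps(lengths, batch_size):
    """Iterations needed when each batch runs until its longest request ends."""
    steps = 0
    for i in range(0, len(lengths), batch_size):
        # The whole batch occupies the GPU for as long as its longest member.
        steps += max(lengths[i:i + batch_size])
    return steps


def continuous_batching_steps(lengths, batch_size):
    """Iterations needed when free slots are refilled before every iteration."""
    waiting = deque(lengths)   # output lengths of queued requests (each >= 1)
    running = []               # tokens still to generate per in-flight request
    steps = 0
    while waiting or running:
        # Admit queued requests into any free slots.
        while waiting and len(running) < batch_size:
            running.append(waiting.popleft())
        # One decoding iteration: every running request emits one token;
        # requests that just emitted their last token leave the batch.
        running = [r - 1 for r in running if r > 1]
        steps += 1
    return steps


def utilisation(lengths, batch_size, steps):
    """Fraction of slot-iterations that produced a token."""
    return sum(lengths) / (steps * batch_size)


# Hypothetical workload: output lengths (in tokens) of eight queued requests.
lengths = [8, 2, 2, 2, 6, 1, 1, 1]
static_steps = static_batching_steps(lengths, batch_size=4)
continuous_steps = continuous_batching_steps(lengths, batch_size=4)
print(static_steps, round(utilisation(lengths, 4, static_steps), 2))
print(continuous_steps, round(utilisation(lengths, 4, continuous_steps), 2))
```

For this workload the static schedule needs 14 iterations and the continuous schedule needs 8, because short requests no longer wait for the longest request in their batch to finish. The size of the gain depends on how widely output lengths vary.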
Synonyms
dynamic batching, inference-time batching
See also
- Serving pattern — The architectural shape of the inference path — managed API, cloud-platform hosted, self-hosted online, self-hosted batch, or edge.
- Prompt caching — An inference optimisation that caches the attention key-value state for a prompt prefix so that subsequent requests sharing the same prefix skip re-processing.
- TTFT (time-to-first-token) — The latency from request submission to the first streamed output token.