Continuous batching

An inference-server technique — popularised by vLLM and Text Generation Inference — that dynamically groups concurrent requests at the token-generation level to raise GPU utilisation.

What this means in practice

Distinct from static batching, where a batch is fixed until every sequence in it completes: with continuous batching, the batch is re-formed at every generation step, so finished sequences free their slots immediately and queued requests join mid-flight (see the sketch below). This keeps the GPU busy regardless of output-length variance, and is central to making self-hosted LLM inference economically viable at scale.
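
To make the per-iteration re-batching concrete, here is a minimal sketch of a continuous-batching serve loop. The `Request` class, `serve` function, and `decode_step` callback are hypothetical names used purely for illustration; a real server such as vLLM layers scheduling policy, KV-cache paging, and preemption on top of this basic idea.

```python
import collections
from dataclasses import dataclass, field


@dataclass
class Request:
    """A hypothetical in-flight generation request."""
    prompt_tokens: list
    max_new_tokens: int
    output_tokens: list = field(default_factory=list)

    def is_finished(self) -> bool:
        return len(self.output_tokens) >= self.max_new_tokens


def serve(pending: collections.deque, batch_limit: int, decode_step) -> None:
    """Continuous-batching loop: the batch is rebuilt every iteration.

    `decode_step` stands in for one forward pass of the model that
    appends one token to each active request.
    """
    active: list[Request] = []
    while pending or active:
        # Admit queued requests into any free slots. This happens per
        # iteration, not per batch, which is what distinguishes
        # continuous batching from static batching.
        while pending and len(active) < batch_limit:
            active.append(pending.popleft())

        # One token-generation step for every active sequence.
        decode_step(active)

        # Retire finished sequences immediately, freeing their slots
        # for the next iteration instead of waiting for the whole
        # batch to drain.
        active = [r for r in active if not r.is_finished()]


# Toy usage: three requests with different output lengths share two slots.
queue = collections.deque(Request([1, 2, 3], max_new_tokens=n) for n in (2, 5, 3))
serve(queue, batch_limit=2,
      decode_step=lambda batch: [r.output_tokens.append(0) for r in batch])
```

The key design choice is that admission and retirement happen inside the per-token loop, so a short request queued behind a long one is never forced to wait for the entire batch to finish.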

Synonyms

dynamic batching, inference-time batching

See also

  • Serving pattern — The architectural shape of the inference path — managed API, cloud-platform hosted, self-hosted online, self-hosted batch, or edge.
  • Prompt caching — An inference optimisation that caches the attention key-value state for a prompt prefix so that subsequent requests sharing the same prefix skip re-processing.
  • TTFT (time-to-first-token) — The latency from request submission to the first streamed output token.