COMPEL Specialization — AITM-AAG: Agentic AI Governance Associate Article 6 of 14
Definition. Tool-use governance is the design and enforcement of constraints on which tools an agent may call, with what parameters, under what authorisation, at what rate, at what cost, and with what validation applied to results. Excessive agency, catalogued as LLM06 in the OWASP Top 10 for LLM Applications (2025 edition), is the failure mode in which the agent’s tool surface, permissions, or latitude exceeds what supervision and validation can safely cover. Source: https://genai.owasp.org/llm-top-10/.
The two concepts are inverses. Tool-use governance is what prevents excessive agency. In practice, every agentic incident examined by the OWASP Agentic AI Threats and Mitigations working group (2024) traces back to a tool-surface, permission, or validation decision that was too broad, too loose, or too trusting. The controls below are the practical countermeasures.
The tool-use control categories
Six control categories, applied in combination, produce safe tool use. Absent any one of them, the agent is exposed.
Category 1 — Allow-list of tools
An agent must only be able to invoke tools on an explicit allow-list. The list is per agent identity, not per model. The same underlying language model (GPT-class, Claude-class, Gemini, Llama, Mistral) may have a narrow allow-list when deployed as a customer-service agent and a broad allow-list when deployed as a research assistant; the model does not determine the list.
The allow-list is maintained in code or configuration and is a governed artifact. Changes to the list trigger reclassification (Article 3). Tools added informally — for testing, for a temporary feature — must be removed explicitly, not forgotten.
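As a minimal sketch, assuming a hand-built orchestration layer (the agent identities and tool names here are illustrative, not from any framework), the allow-list can live in configuration and be checked before any dispatch:

```python
# Per-agent-identity tool allow-list. In production this is a governed
# configuration artifact: changes to it trigger reclassification (Article 3),
# and temporary additions must be removed explicitly.
ALLOW_LISTS = {
    "customer-service-agent": {"lookup_order", "send_acknowledgement"},
    "research-assistant": {"web_search", "read_document", "summarise"},
}

def authorise_tool(agent_id: str, tool_name: str) -> bool:
    """True only if this agent identity's allow-list names the tool.

    The list is keyed by agent identity, not by model: the same model may
    carry a narrow or broad list depending on the deployment.
    """
    return tool_name in ALLOW_LISTS.get(agent_id, set())
```

An unknown agent identity falls through to an empty set, so the default answer is no.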
Category 2 — Least-privilege permission scope per tool
Each tool is invoked with the least privilege necessary for its task. An email-send tool that an agent uses to send customer acknowledgements should not be invocable against arbitrary addresses with arbitrary subject lines. The tool’s parameters should be constrained — to a specific sender, a specific domain list, specific templates — so that the agent cannot exceed the intended use by varying the input.
Scope is enforced at the tool wrapper, not in the prompt. Relying on the prompt to tell the agent “don’t send to external domains” is a prompt-injection vulnerability. The wrapper rejects any parameter outside its scope; the agent may ask, but the system answers no.
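A sketch of wrapper-level enforcement for the email example above (the sender address, domain list, and template names are hypothetical):

```python
# Scope constants live in the wrapper, not in the prompt. The agent can
# request anything; the wrapper answers no to anything out of scope.
ALLOWED_RECIPIENT_DOMAINS = {"example.com"}      # illustrative scope
FIXED_SENDER = "support@example.com"             # agent cannot vary the sender
ALLOWED_TEMPLATES = {"order_ack", "shipping_update"}

def send_acknowledgement(recipient: str, template: str) -> str:
    """Tool wrapper: rejects any parameter outside the intended use."""
    domain = recipient.rsplit("@", 1)[-1]
    if domain not in ALLOWED_RECIPIENT_DOMAINS:
        raise PermissionError(f"recipient domain {domain!r} outside scope")
    if template not in ALLOWED_TEMPLATES:
        raise PermissionError(f"template {template!r} not permitted")
    # The underlying mail call would go here, always from FIXED_SENDER.
    return f"queued {template} from {FIXED_SENDER} to {recipient}"
```

A prompt-injected request to mail an external address fails at the wrapper regardless of what the model was convinced to attempt.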
Category 3 — Parameter schema and validation
Every tool has a schema for its parameters. Calls that don’t conform are rejected. The schema also enforces business-rule validation — a refund tool might reject refund amounts above a per-transaction cap or require a reference to an existing order. Validation happens at the wrapper, not inside the agent’s reasoning.
Parameter-schema validation is the specific defence against the class of attacks sometimes called parameter exfiltration, where the agent is induced to encode stolen data into tool parameters that are visible to an adversary. If the schema is tight, there is no place to hide the exfiltrated payload.
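Teams typically express the schema in Pydantic or JSON Schema (see the framework table below); the sketch here uses only the standard library, and the refund cap and order identifiers are invented for illustration:

```python
REFUND_CAP = 200.00                        # illustrative per-transaction cap
KNOWN_ORDERS = {"ORD-1001", "ORD-1002"}    # stand-in for a real order lookup

def validate_refund(params: dict) -> list[str]:
    """Return the list of schema and business-rule violations (empty = valid)."""
    errors = []
    allowed_keys = {"order_id", "amount"}
    for key in params:
        if key not in allowed_keys:
            # A closed key set leaves no parameter in which to hide a payload.
            errors.append(f"unexpected parameter {key!r}")
    if params.get("order_id") not in KNOWN_ORDERS:
        errors.append("order_id does not reference an existing order")
    amount = params.get("amount")
    if not isinstance(amount, (int, float)) or not (0 < amount <= REFUND_CAP):
        errors.append(f"amount must be in (0, {REFUND_CAP}]")
    return errors
```

The closed key set is the anti-exfiltration property: a call carrying an extra, attacker-visible field is rejected before execution.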
Category 4 — Rate and cost caps
Tools have per-agent, per-session, and per-day rate limits. An agent that can invoke a tool one hundred times per second can exhaust a downstream system’s capacity, exhaust a budget, or cause cascading incidents. Caps are enforced at the wrapper.
Cost caps apply to tools that carry direct monetary consequences (model provider API calls, paid third-party services, transactional tools). The AutoGPT incidents of 2023 (covered in Article 1) were in part a cost-cap failure: early implementations had no per-session budget and could drain an API quota in minutes.
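A per-session budget can be a few lines at the wrapper layer; this sketch (class name and limits are illustrative) combines a call cap and a cost cap:

```python
class ToolBudget:
    """Per-session invocation and cost caps, checked before every call."""

    def __init__(self, max_calls: int, max_cost: float):
        self.max_calls, self.max_cost = max_calls, max_cost
        self.calls, self.cost = 0, 0.0

    def charge(self, call_cost: float) -> None:
        """Refuse the call if either cap would be exceeded; otherwise record it."""
        if self.calls + 1 > self.max_calls:
            raise RuntimeError("per-session call cap exceeded")
        if self.cost + call_cost > self.max_cost:
            raise RuntimeError("per-session cost cap exceeded")
        self.calls += 1
        self.cost += call_cost
```

Per-day and per-agent caps follow the same shape with a shared store; the essential point is that the counter lives outside the agent's reasoning.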
Category 5 — Result sanitisation
The output of a tool, before it re-enters the agent’s context, is sanitised. Three sanitisations are standard:
- Content-type enforcement. The output matches the declared content type (no markdown-injected HTML, no unexpected tool-call instructions).
- Injection defence. Tool outputs are checked for patterns that look like they want to take control of the agent — instructions, system-prompt overrides, new tool-call directives.
- Sensitive-data redaction. Tool outputs that may contain PII or confidential data are redacted or tokenised before the agent sees them.
The 2023 security research on ChatGPT plugin-based data exfiltration by Johann Rehberger demonstrated the attack class that sanitisation defends against. In the demonstrated attack, a document opened by a tool contained an indirect prompt injection that caused the agent to exfiltrate conversation data via a follow-on tool call. Source: https://embracethered.com/blog/posts/2023/chatgpt-webpilot-data-exfil-via-markdown-injection/. The attack class remains relevant across any agent with document-reading tools, regardless of which model or framework is in use.
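A toy sanitiser showing the injection-defence and redaction steps; the two patterns below are illustrative stand-ins for what would be a far more extensive production filter set:

```python
import re

# Illustrative patterns only: real deployments use much larger rule sets
# and, often, a classifier alongside them.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all|previous) instructions", re.I),
    re.compile(r"system prompt", re.I),
]
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def sanitise_tool_output(text: str) -> str:
    """Screen a tool result before it re-enters the agent's context."""
    for pat in INJECTION_PATTERNS:
        if pat.search(text):
            # Reject rather than pass through: the agent never sees it.
            raise ValueError("tool output contains injection-like content")
    return EMAIL.sub("[REDACTED-EMAIL]", text)
```

Note the asymmetry: injection-like content is rejected outright, while PII is redacted and the remainder allowed through.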
Category 6 — Audit log
Every tool invocation — the agent identity, the tool, the parameters, the result, the timestamp, the session identifier — is logged to an audit store the agent cannot modify. The log is read-only to the agent; it is the observability spine (Article 10) and the first resource in an incident (Article 11).
The log format is stable across tool implementations. When an organisation changes orchestration frameworks — say from a LangGraph implementation to a CrewAI or AutoGen or OpenAI Agents SDK implementation — the audit log format is the stable layer; the framework choice should not invalidate months of historical data.
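A stable, framework-neutral record format can be as simple as one JSON line per invocation; the field names below mirror the list above and the sink (file, queue, log service) sits outside the orchestration framework:

```python
import json
import time

# The stable layer: these fields survive a change of orchestration framework.
AUDIT_FIELDS = ("agent_id", "tool", "params", "result", "timestamp", "session_id")

def audit_record(agent_id, tool, params, result, session_id) -> str:
    """Serialise one tool invocation as a JSON line for an append-only sink.

    The agent has no write path to the sink; this function is called by the
    tool wrapper, not by the agent.
    """
    record = {
        "agent_id": agent_id, "tool": tool, "params": params,
        "result": result, "timestamp": time.time(), "session_id": session_id,
    }
    assert set(record) == set(AUDIT_FIELDS)
    return json.dumps(record, sort_keys=True)
```

Sorted keys and a fixed field set keep historical lines diffable and queryable across framework migrations.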
Excessive agency — the OWASP LLM06 pattern
The OWASP Top 10 for LLM Applications, in its 2025 edition, lists Excessive Agency as LLM06. The entry describes three sub-patterns that together make up the failure mode:
- Excessive functionality. The agent has tools it does not need for its intended purpose.
- Excessive permissions. The tools the agent has are invoked with privileges beyond what the task requires.
- Excessive autonomy. The agent acts without a human check on high-consequence actions.
Each sub-pattern maps to one or more of the control categories above. Excessive functionality is a Category 1 failure (the allow-list is too broad). Excessive permissions is a Category 2 failure (least-privilege scoping was not applied). Excessive autonomy is a human-oversight failure (Article 5) combined with a Category 6 failure here (the audit log was not wired to alerting or intervention).
A governance review that walks the six categories above for each agent is, in effect, an LLM06 audit. The review’s output is either a green light or a named set of deficiencies with remediation owners.
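The review's output format can be sketched in a few lines (the category names are shortened from the six headings above):

```python
# The six control categories, in review order.
CONTROL_CATEGORIES = [
    "allow-list", "least-privilege scope", "parameter validation",
    "rate and cost caps", "result sanitisation", "audit log",
]

def llm06_review(controls_present: set) -> list:
    """Walk the six categories for one agent.

    An empty return is the green light; a non-empty return is the named
    set of deficiencies to assign remediation owners against.
    """
    return [c for c in CONTROL_CATEGORIES if c not in controls_present]
```

The value of the walk is less the code than the discipline: every agent, every category, every review cycle.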
The Chevrolet of Watsonville pattern
In December 2023, a user engaged with the Chevrolet of Watsonville dealership’s website chatbot and induced it to commit to selling a Chevrolet Tahoe for one dollar, complete with a promise that the offer was legally binding. The dealership did not honour the offer. The incident was widely covered and became a public touchstone for the risks of loosely scoped customer-service agents. Source: https://www.businessinsider.com/car-dealership-chatgpt-goes-rogue-2023-12.
The incident illustrates several of the control categories at once. The agent was scoped too broadly — its allow-list should not have included “commit to a sale price.” The parameter validation was absent — there was no business-rule check preventing a nonsensical price. The result sanitisation was absent — nothing in the chatbot’s output pipeline flagged an offer outside any plausible bound. The audit log was relevant after the fact, but the absence of runtime controls meant the incident reached the public before anyone inside the dealership knew about it. Any one of the missing controls, if present, would likely have prevented the outcome.
Tool-use governance across frameworks
The control categories are framework-neutral. The table below names where each is typically implemented; the exact file layout varies.
| Category | LangGraph pattern | OpenAI Agents SDK pattern | CrewAI pattern | AutoGen pattern | Hand-built on any OSS model |
|---|---|---|---|---|---|
| Allow-list | Tool set per graph | Tool list per Agent | Tool list per Agent | Function map per agent | Registry module |
| Least-privilege | Custom tool wrapper | Custom function defs | Custom tool wrapper | Custom function defs | Wrapper functions |
| Parameter validation | Pydantic / JSONSchema | Pydantic | Pydantic | Pydantic | Whatever the team chose |
| Rate/cost caps | Middleware | Middleware | Middleware | Middleware | Cross-cutting concern |
| Result sanitisation | Post-tool hook | Post-tool hook | Post-tool hook | Post-tool hook | Cross-cutting |
| Audit log | External sink | External sink | External sink | External sink | External sink |
The external-sink pattern for audit logs — a dedicated log store outside the orchestration framework — is recommended across every implementation. Framework-internal logs are useful for debugging but are not sufficient for the evidentiary purposes the audit log must serve.
The governance-ready tool
A tool is governance-ready when its design includes each of these, from a specification a reviewer can read:
- Stated purpose, in one sentence.
- Authorised agent identities.
- Parameter schema with constraints.
- Least-privilege invocation (what underlying permissions are used).
- Rate and cost caps.
- Result sanitisation rules.
- Audit fields emitted.
- Owner and review cadence.
A tool specification that cannot list each item is not production-ready. The Methodology Lead should return it to the engineering team for completion before the agent that uses it can be classified at any level above Level 1 (advisor) in Article 3 terms.
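The eight items translate directly into a specification object a reviewer (or a CI check) can inspect; the field names and example values below are illustrative:

```python
from dataclasses import dataclass, fields

@dataclass
class ToolSpec:
    """The eight items a reviewer must be able to read off the spec."""
    purpose: str              # one sentence
    authorised_agents: list   # agent identities, not models
    parameter_schema: dict    # constraints per parameter
    least_privilege: str      # underlying permissions used
    rate_cost_caps: dict      # per-session / per-day limits
    sanitisation_rules: list  # applied to results
    audit_fields: list        # emitted per invocation
    owner_and_cadence: str    # named owner and review cadence

def is_governance_ready(spec: ToolSpec) -> bool:
    """Every item must be filled in; any empty field means not production-ready."""
    return all(getattr(spec, f.name) for f in fields(spec))
```

A check like this can gate the merge that registers the tool, so an incomplete specification never reaches a running agent.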
Learning outcomes — confirm
A specialist who completes this article should be able to:
- Recite the six control categories and map each to the LLM06 sub-patterns it addresses.
- Design a governance-ready tool specification for a described tool.
- Diagnose a described incident as a failure in one or more control categories.
- Evaluate a tool-use design against OWASP LLM06 and name the deficiencies.
Cross-references
- EATF-Level-1/M1.5-Art12-Safety-Boundaries-and-Containment-for-Autonomous-AI.md — safety boundaries and containment.
- Article 3 of this credential — autonomy classification (which tool additions trigger reclassification).
- Article 7 of this credential — memory governance (result sanitisation feeds memory policy).
- Article 11 of this credential — kill-switch and containment (tool-use audit is the incident-response substrate).
Diagrams
- StageGateFlow — tool-use flow: agent requests → permission check → parameter validation → execution → result sanitiser → audit log.
- HubSpokeDiagram — tool-use governance hub with spokes: allow-list, permission scope, rate limit, cost cap, audit, alert.
Quality rubric — self-assessment
| Dimension | Self-score (of 10) |
|---|---|
| Technical accuracy (LLM06 mapping verifiable against OWASP text) | 10 |
| Technology neutrality (LangGraph, CrewAI, AutoGen, OpenAI Agents SDK, LlamaIndex, hand-built all named) | 10 |
| Real-world examples ≥2 (OWASP LLM06, Rehberger, Chevrolet) | 10 |
| AI-fingerprint patterns | 9 |
| Cross-reference fidelity | 10 |
| Word count (target 2,500 ± 10%) | 10 |
| Weighted total | 92 / 100 |