COMPEL Body of Knowledge — Agentic Governance Series (Cluster D) Protocol Hardening Standard
What is MCP and why it needs its own security standard {#why}
The Model Context Protocol (MCP) is Anthropic’s open protocol, published in late 2024, for connecting AI assistants to external data systems and tools. It defines a JSON-RPC interface over stdio or HTTP+SSE between an MCP host (the AI application), an MCP client (an in-host transport adapter), and an MCP server (a process that exposes tools, resources, and prompts). The protocol has rapidly become the dominant integration surface for enterprise LLM deployments: vendors ship MCP servers for CRMs, code repositories, data warehouses, and internal APIs, and hosts — Claude Desktop, Claude Code, Cline, and a growing list of third parties — consume them through a common adapter.
A single protocol concentrates risk. A compromised server can steer an LLM into exfiltrating data, executing arbitrary code, or invoking tools against systems of record. A legitimate server that returns attacker-controlled content can inject instructions into the context window. A client that does not scope credentials can hand a database admin token to a weather-lookup tool.
Traditional API security assumes a deterministic client whose logic is fixed at build time — it calls POST /transfer only when a human presses “Transfer.” MCP breaks that assumption. The client is an LLM whose behavior is shaped at inference time by the content it reads, including responses from the servers it talks to. A web page, a support ticket, or a PDF can contain text that manipulates the client into calling tools it should not.
MCP therefore requires controls traditional API security omits: protocol-level integrity so a server cannot be impersonated mid-session; tool-scope isolation so a credential leaks only one tool’s capabilities; output sanitization so tool results are treated as data, never instructions; and supply-chain verification because an MCP server is executable code the host runs with user privileges.
This article specifies a 12-control hardening baseline. Each control is paired with a reference-implementation note, mapped to NIST CSF 2.0 and ISO/IEC 42001, and placed in the COMPEL lifecycle.
MCP threat model {#threat-model}
Before specifying controls, we enumerate the threats they counter. The MCP attack surface has seven distinct classes; a baseline that does not address all seven is incomplete.
1. Malicious MCP servers
An attacker publishes an MCP server on npm, PyPI, GitHub, or a plugin marketplace, advertising useful tools. Once installed, it runs as a persistent process with the user’s privileges — reading local files, exfiltrating environment variables, harvesting OAuth tokens, and shipping deceptive tool descriptions that steer the LLM toward attacker-controlled actions.
2. Compromised legitimate MCP servers
A trusted server is compromised via dependency hijack, stolen signing key, publisher breach, or insider threat. Customers who auto-update receive the backdoored build. The SolarWinds class of attack reappears, but the payload now runs inside the LLM’s tool-call loop.
3. Tool abuse — the confused deputy
The server holds a legitimate credential and the LLM is tricked into using it for an illegitimate purpose. Classic case: an agent asked to “clean up my drafts” is manipulated by a hidden instruction in one draft into emailing the rest to an attacker. The server authenticated the call; it did the wrong thing with valid authority.
4. Token and secret leakage
Bearer tokens flow through headers and environment variables. A log line that echoes a header, a debug dump, or a tool response that accidentally includes credential material leaks secrets. Because MCP servers are long-lived, one leak persists until the token expires.
5. Cross-tenant data access
In multi-tenant SaaS, a single MCP server serves many customers. A missing WHERE tenant_id = ? clause, or a connection pool keyed by credential rather than tenant, leaks tenant A’s data to tenant B.
6. Prompt injection via tool output
The LLM calls a legitimate tool that returns third-party content — a search result, a customer email — containing instructions the LLM follows. The server is not compromised; the data is. Variants include invisible Unicode directives, steganographic payloads in images, and instructions in downstream error messages.
7. Supply-chain attacks
The server’s own dependencies are compromised. Because MCP servers run with user credentials and network access, dependency compromise equals endpoint compromise. The 2024 xz-utils incident and the 2025 wave of typosquatted mcp-* packages show the threat is already active.
The 12 controls below are calibrated so that a comprehensive deployment blocks all seven threat classes with layered defenses — no single threat relies on a single control.
The 12-control hardening checklist {#controls}
Control 1 — Mutual TLS between client and server
What it does. Both the MCP client and the MCP server present X.509 certificates and verify each other. This prevents a rogue process on the same host from impersonating a server, defends against network-level man-in-the-middle even when DNS or routing is compromised, and provides a cryptographic identity independent of user-level tokens.
Reference implementation. Issue client and server certificates from an internal PKI with short lifetimes (24–72 hours). Pin the server certificate fingerprint in client config and refuse non-matching connections. For stdio transports, enforce the equivalent via signed launch manifests and process attestation. SPIFFE/SPIRE fits large deployments.
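A minimal sketch of the client-side pinning step, using only the Python standard library. The certificate paths and the pinned fingerprint are placeholders, and stdio-transport attestation is out of scope here:

```python
import hashlib
import ssl

def verify_pin(der_cert: bytes, expected_sha256_hex: str) -> bool:
    """Compare the SHA-256 fingerprint of a DER-encoded certificate
    against the fingerprint pinned in client configuration."""
    return hashlib.sha256(der_cert).hexdigest() == expected_sha256_hex.lower()

def pinned_client_context(ca_file: str, client_cert: str, client_key: str) -> ssl.SSLContext:
    """Build a TLS context that (a) presents a client certificate for
    mutual TLS and (b) requires and verifies the server's certificate."""
    ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_file)
    ctx.load_cert_chain(certfile=client_cert, keyfile=client_key)
    ctx.verify_mode = ssl.CERT_REQUIRED
    ctx.check_hostname = True
    return ctx

# After the handshake, fetch the peer certificate in DER form and check the pin:
#   der = tls_sock.getpeercert(binary_form=True)
#   if not verify_pin(der, PINNED_FINGERPRINT): abort the connection
```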
Control 2 — Short-lived OAuth tokens scoped per tool
What it does. Every tool call is authenticated by an OAuth 2.1 access token scoped to exactly the tool being invoked and expiring within 15 minutes. Tokens are minted by a trusted authorization server on demand, with urn:mcp:tool:* scopes that describe the operation, not the API.
Reference implementation. Put an internal STS (security token service) in front of the MCP server. The client requests a token with scope=urn:mcp:tool:email:send:recipient=&lt;allowlisted&gt; and receives a JWT valid for 15 minutes. The server verifies signature and scope on every call; it never holds long-lived downstream credentials. Downstream calls use RFC 8693 token exchange for narrow per-service tokens.
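A stdlib-only sketch of the mint/verify cycle. A production STS would issue asymmetrically signed JWTs (RS256/ES256) verified against published keys; the HMAC here stands in so the signature, expiry, and exact-scope checks are visible:

```python
import base64
import hashlib
import hmac
import json
import time

def b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def mint_token(secret: bytes, scope: str, ttl_s: int = 900) -> str:
    """Mint a compact HMAC-signed token carrying a per-tool scope and a
    15-minute expiry (900 s), mimicking what the STS would issue."""
    payload = b64url(json.dumps({"scope": scope, "exp": int(time.time()) + ttl_s}).encode())
    sig = b64url(hmac.new(secret, payload.encode(), hashlib.sha256).digest())
    return f"{payload}.{sig}"

def verify_token(secret: bytes, token: str, required_scope: str) -> bool:
    """Server-side check on every call: signature, expiry, exact scope match."""
    try:
        payload_b64, sig = token.split(".")
    except ValueError:
        return False
    expected = b64url(hmac.new(secret, payload_b64.encode(), hashlib.sha256).digest())
    if not hmac.compare_digest(sig, expected):
        return False
    pad = "=" * (-len(payload_b64) % 4)
    claims = json.loads(base64.urlsafe_b64decode(payload_b64 + pad))
    return claims["exp"] > time.time() and claims["scope"] == required_scope
```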
Control 3 — Tool scope isolation and allowlisting
What it does. Each MCP server declares a capability manifest listing the tools it exposes and the scopes each tool requires. The host enforces an organization-wide allowlist that pins which servers and tools a given agent role may invoke. Requests for tools outside the allowlist are denied at the orchestrator layer before any network call is issued.
Reference implementation. Store the allowlist as a versioned YAML document signed by the governance team’s key. The host loads it at startup and on policy-system webhook. Agents under a given role see only their role’s tool union; disallowed tools are redacted from the LLM’s view entirely — not merely denied at call time — removing them from the attack surface.
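The load-then-redact flow might look like the following sketch. The HMAC stands in for the governance team's asymmetric signature, and the role and tool names are hypothetical:

```python
import hashlib
import hmac
import json

def load_policy(raw: bytes, signature_hex: str, signing_key: bytes) -> dict:
    """Refuse to load an allowlist whose signature does not verify.
    (Production deployments would verify an asymmetric signature instead.)"""
    mac = hmac.new(signing_key, raw, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(mac, signature_hex):
        raise ValueError("allowlist signature verification failed")
    return json.loads(raw)

def visible_tools(policy: dict, role: str, advertised: list) -> list:
    """Redact disallowed tools entirely: tools outside the role's
    allowlist never appear in the LLM's view."""
    allowed = set(policy.get("roles", {}).get(role, []))
    return [t for t in advertised if t in allowed]
```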
Control 4 — Rate limiting per client, per tool, per tenant
What it does. Limits the blast radius of any single compromise. A leaked token cannot be used to enumerate an entire dataset; a looping agent cannot drive cost explosions; a prompt-injected exfiltration cannot run faster than the allowed per-tool budget.
Reference implementation. Three-dimensional limits — client × tool × tenant — enforced at the server with a Redis-backed token bucket. Defaults are conservative (10 calls/minute per client per tool) with per-deployment tuning. Violations trigger a soft alert at first breach and a hard lockout after three breaches in five minutes. Bucket keys include tenant ID to prevent cross-tenant exhaustion.
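An in-memory sketch of the three-dimensional bucket; a production version would back the buckets with Redis so all server replicas share state, as noted above:

```python
import time
from collections import defaultdict

class ThreeDimRateLimiter:
    """Token bucket keyed by (client, tool, tenant), so no single
    dimension can exhaust another tenant's budget."""

    def __init__(self, rate_per_min: int = 10, burst: int = 10):
        self.rate = rate_per_min / 60.0              # tokens refilled per second
        self.burst = burst                           # bucket capacity
        # Each bucket starts full, stamped with its creation time.
        self.buckets = defaultdict(lambda: (burst, time.monotonic()))

    def allow(self, client: str, tool: str, tenant: str) -> bool:
        key = (client, tool, tenant)
        tokens, last = self.buckets[key]
        now = time.monotonic()
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens < 1:
            self.buckets[key] = (tokens, now)
            return False                             # over budget: deny
        self.buckets[key] = (tokens - 1, now)
        return True
```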
Control 5 — Parameter schema validation (server-side)
What it does. Every MCP tool declares a JSON Schema for its parameters. The server validates every incoming call against the schema and rejects any deviation — extra fields, type mismatches, out-of-range values, unknown enums. This closes the most common confused-deputy attack path, where an LLM is steered into setting a parameter the tool was never designed to accept.
Reference implementation. Publish schemas alongside the tool manifest. The MCP server uses a battle-tested validator (Ajv for Node, jsonschema for Python) in strict mode with additionalProperties: false. Constraints must include format (regex, email, URI), numeric bounds, string length, and enum membership. Nullable fields must be explicit. Schema failures return a typed error that the orchestrator logs without echoing the offending payload back to the LLM.
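The strict-mode rules can be illustrated with a deliberately minimal validator. Production servers should use Ajv or jsonschema as noted above, since full JSON Schema covers far more (formats, regex patterns, nested objects); this sketch only shows the rejection behavior:

```python
def validate_params(schema: dict, params: dict) -> list:
    """Minimal strict validator: reject unknown fields, missing required
    fields, type mismatches, enum violations, and out-of-range values.
    Returns a list of error strings; empty means the call is accepted."""
    errors = []
    props = schema.get("properties", {})
    for name in params:
        if name not in props:                        # additionalProperties: false
            errors.append(f"unknown field: {name}")
    for name in schema.get("required", []):
        if name not in params:
            errors.append(f"missing required field: {name}")
    type_map = {"string": str, "integer": int, "number": (int, float), "boolean": bool}
    for name, rule in props.items():
        if name not in params:
            continue
        value = params[name]
        expected = type_map.get(rule.get("type"))
        if expected and not isinstance(value, expected):
            errors.append(f"type mismatch: {name}")
            continue
        if "enum" in rule and value not in rule["enum"]:
            errors.append(f"enum violation: {name}")
        if "maximum" in rule and value > rule["maximum"]:
            errors.append(f"above maximum: {name}")
    return errors
```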
Control 6 — Output sanitization (mark tool output as untrusted)
What it does. Ensures the LLM cannot mistake data returned by a tool for instructions from the user or system. The server wraps every response in an untrusted-content envelope; the host prompt template explicitly tells the LLM that content inside the envelope is data to reason about, not commands to execute.
Reference implementation. Two layers. At the server, escape or strip patterns resembling prompt directives (<|im_start|>, ###, “ignore previous”, invisible Unicode tag characters U+E0000–U+E007F, BOM-smuggling). At the host, wrap sanitized responses in a structured container (e.g. <mcp_result tool="..." provenance="untrusted">...</mcp_result>) with a system instruction to treat such containers as evidence, not directives. Deploy canary instructions in safe test servers and alert if the LLM follows them.
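A sketch of the server-side stripping layer and the host envelope. The directive patterns are an illustrative subset, not a complete denylist, and deployments should tune them:

```python
import re

# Patterns that commonly smuggle instructions into tool output (illustrative).
_DIRECTIVE = re.compile(r"<\|im_start\|>|ignore (all )?previous", re.IGNORECASE)
# Unicode tag characters (U+E0000..U+E007F) and BOM, used for invisible payloads.
_INVISIBLE = re.compile(r"[\U000E0000-\U000E007F\ufeff]")

def sanitize(text: str) -> str:
    """Strip invisible Unicode payloads and neutralize directive-like text."""
    text = _INVISIBLE.sub("", text)
    return _DIRECTIVE.sub("[redacted-directive]", text)

def envelope(tool: str, text: str) -> str:
    """Wrap sanitized output in the untrusted-content container that the
    host's system prompt tells the LLM to treat as data, never commands."""
    return f'<mcp_result tool="{tool}" provenance="untrusted">{sanitize(text)}</mcp_result>'
```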
Control 7 — Audit logging with tamper-evident signatures
What it does. Creates a forensic record that survives insider threats and post-incident tampering. Every MCP message — connection establishment, capability advertisement, tool call, tool result, error — is logged with a cryptographic hash chained to the previous entry, so that any modification or deletion of a single log line invalidates the chain from that point forward.
Reference implementation. Hash-chained append-only log in an S3 Object Lock bucket or equivalent WORM store. Each entry records timestamp, tenant, user, agent run ID, server ID, tool, parameter hash, result hash (not contents), latency, and an HMAC over the previous entry’s hash. A daily Merkle root is signed by an HSM key and published to an internal transparency log. On-call engineers verify the chain end-to-end in under five minutes with the shipped verifier.
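The chaining and verification logic reduces to a few lines. WORM storage, Merkle roots, and HSM signing are out of scope for this sketch, which only shows why tampering with any entry breaks the chain:

```python
import hashlib
import hmac
import json

class HashChainedLog:
    """Append-only log where each entry carries an HMAC over the previous
    entry's hash; editing or deleting any entry invalidates verification
    from that point forward."""

    GENESIS = "0" * 64

    def __init__(self, key: bytes):
        self.key = key
        self.entries = []
        self.prev_hash = self.GENESIS

    def append(self, record: dict) -> None:
        body = json.dumps(record, sort_keys=True)
        link = hmac.new(self.key, self.prev_hash.encode(), hashlib.sha256).hexdigest()
        self.entries.append({"body": body, "prev": self.prev_hash, "link": link})
        self.prev_hash = hashlib.sha256((body + link).encode()).hexdigest()

    def verify(self) -> bool:
        prev = self.GENESIS
        for e in self.entries:
            if e["prev"] != prev:
                return False                          # chain break: entry altered
            link = hmac.new(self.key, prev.encode(), hashlib.sha256).hexdigest()
            if not hmac.compare_digest(link, e["link"]):
                return False                          # forged link
            prev = hashlib.sha256((e["body"] + link).encode()).hexdigest()
        return True
```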
Control 8 — Secret rotation and HSM-backed key storage
What it does. Limits the window of usefulness for any stolen secret and ensures long-lived cryptographic identities (signing keys, root certificates, OAuth client secrets) are never exposed in process memory or configuration files.
Reference implementation. Automated 24-hour rotation for OAuth client secrets and database passwords; 90-day TLS certificates; annual signing-key rotation. Secrets are brokered via a manager (Vault, AWS Secrets Manager, Azure Key Vault) with instance-profile auth — no static credentials on disk. Root signing keys live in an HSM or cloud KMS; servers call the KMS to sign rather than holding key material. Rotation failures page on-call immediately.
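The compliance check that feeds the paging path can be as simple as comparing each secret's age against its declared window. Field names below are hypothetical:

```python
import time

def rotation_violations(secrets: list, now: float = None) -> list:
    """Return names of secrets older than their declared rotation window.
    A non-empty result feeds the 'rotation compliance' metric and pages
    on-call immediately, per the reference implementation above."""
    if now is None:
        now = time.time()
    return [s["name"] for s in secrets if now - s["rotated_at"] > s["window_s"]]
```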
Control 9 — Replay attack defense (nonce and timestamp)
What it does. Prevents an attacker who captures a valid MCP request from replaying it later. Every request includes a nonce and a timestamp; the server rejects requests outside a five-minute window or reusing a recent nonce.
Reference implementation. Each client-to-server request carries X-MCP-Nonce (128-bit random) and X-MCP-Timestamp (Unix epoch, milliseconds) headers, both covered by the HMAC signature. The server maintains a sliding-window nonce cache (Redis with TTL=600s) and rejects replays with HTTP 409. For stdio transports the equivalent is a per-session sequence number enforced by the client launcher. Clock skew beyond ±90 seconds triggers a warning; beyond ±300 seconds triggers rejection.
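An in-memory sketch of the two checks; the Redis-backed cache described above replaces the dict in production so all replicas share nonce state:

```python
import time

class ReplayGuard:
    """Reject requests with stale timestamps or reused nonces."""

    def __init__(self, window_s: int = 300, ttl_s: int = 600):
        self.window_s = window_s       # accept window for timestamps
        self.ttl_s = ttl_s             # how long a nonce stays cached
        self.seen = {}                 # nonce -> first-seen time

    def check(self, nonce: str, timestamp_ms: int, now: float = None) -> bool:
        if now is None:
            now = time.time()
        # Evict expired nonces (Redis does this via TTL).
        self.seen = {n: t for n, t in self.seen.items() if now - t < self.ttl_s}
        if abs(now - timestamp_ms / 1000.0) > self.window_s:
            return False               # outside the five-minute window
        if nonce in self.seen:
            return False               # replayed nonce
        self.seen[nonce] = now
        return True
```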
Control 10 — Network egress allowlist for MCP servers
What it does. Blocks data exfiltration even if an MCP server is compromised. The server runs in a network namespace whose outbound connections are restricted to a named allowlist of domains and ports required by the tools it exposes.
Reference implementation. Deploy MCP servers into a dedicated Kubernetes namespace with a default-deny NetworkPolicy. Explicit egress rules list only the downstream APIs each tool needs (for example, api.salesforce.com:443 for a CRM tool). DNS resolution goes through an internal resolver that blocks lookups to non-allowlisted domains. Egress violations trigger a high-severity alert and automatic server quarantine. For stdio servers running on developer workstations, use nft/pf rules or eBPF-based enforcement.
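A NetworkPolicy for a hypothetical CRM server might look like the following. NetworkPolicy operates on IPs and ports, so the domain-level allowlist is enforced at the internal resolver as described above; namespace names, labels, and selectors here are assumptions for illustration:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: mcp-crm-egress
  namespace: mcp-servers          # dedicated namespace, default-deny baseline
spec:
  podSelector:
    matchLabels:
      app: mcp-crm                # hypothetical CRM MCP server
  policyTypes:
    - Egress
  egress:
    - to:                         # DNS only via the internal resolver
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
      ports:
        - protocol: UDP
          port: 53
    - ports:                      # HTTPS to the downstream CRM API
        - protocol: TCP
          port: 443
```

Because the internal resolver refuses lookups for non-allowlisted domains, a compromised server cannot resolve an exfiltration endpoint even though port 443 egress is open.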
Control 11 — Sandboxing for code-execution tools (per-call microVM)
What it does. Any MCP tool that executes code (REPL, shell, script runner, code formatter, build tool) runs each invocation in a fresh microVM with no persistent state, no network egress by default, and a strict resource budget. This contains both malicious code supplied through prompt injection and misbehaving tool implementations.
Reference implementation. Firecracker, gVisor, or Kata Containers backing a per-call sandbox with 256 MB memory, 60-second wall-clock limit, read-only root filesystem, scratch /tmp of 64 MB, and no network. Outputs are captured and returned via the MCP response; the sandbox is destroyed unconditionally at end of call. Code-execution tools must declare the egress domains they need (rare) and the orchestrator must approve them under Control 10. Even a successful sandbox breakout yields little, because the underlying host exposes no network access and holds no persistent secrets.
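Where a microVM is unavailable (for example, a developer workstation), a POSIX resource-limited subprocess approximates the per-call budget. This is a budget, not an isolation boundary; it does not replace Firecracker or gVisor. The sketch assumes a Unix host and uses a wider memory limit than the microVM's 256 MB to leave headroom for the interpreter:

```python
import resource
import subprocess
import sys
import tempfile

def run_sandboxed(code: str, timeout_s: int = 60, mem_bytes: int = 512 * 2**20) -> str:
    """Run untrusted Python code in a child process with CPU, memory, and
    file-size budgets, a fresh scratch directory per call, and isolated mode."""

    def limits():
        resource.setrlimit(resource.RLIMIT_AS, (mem_bytes, mem_bytes))
        resource.setrlimit(resource.RLIMIT_CPU, (timeout_s, timeout_s))
        resource.setrlimit(resource.RLIMIT_FSIZE, (64 * 2**20, 64 * 2**20))

    with tempfile.TemporaryDirectory() as scratch:   # destroyed after the call
        proc = subprocess.run(
            [sys.executable, "-I", "-c", code],      # -I: isolated interpreter
            capture_output=True, text=True,
            timeout=timeout_s, cwd=scratch, preexec_fn=limits,
        )
    return proc.stdout
```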
Control 12 — Supply-chain verification (signed manifests and SBOM)
What it does. Ensures the MCP server code that is running is the code that was authorized to run. Every server ships a Software Bill of Materials (SBOM) and a signed manifest; the host verifies both at install time and at every launch.
Reference implementation. Servers are published as OCI artifacts with Sigstore-signed manifests and CycloneDX SBOMs. The host runs cosign verify against a pinned set of trusted signers (the governance team’s identity for internal servers; a small allowlist of vendors for third-party servers). SBOMs are scanned against a vulnerability feed (OSV, GitHub Advisory) daily; findings of severity ≥ High trigger an advisory on the internal MCP catalog and a 72-hour remediation deadline. Unsigned servers cannot be installed; manifests that fail verification cannot launch.
Mapping each control to NIST CSF 2.0 and ISO 42001 {#compliance-mapping}
| # | Control | NIST CSF 2.0 | ISO/IEC 42001 Annex A |
|---|---|---|---|
| 1 | Mutual TLS | PR.AA-05 (authentication mechanisms) · PR.DS-02 (data-in-transit) | A.6.2.6 · A.6.2.8 |
| 2 | Short-lived OAuth scoped per tool | PR.AA-01 (identities) · PR.AA-03 (users authenticated) | A.6.2.6 · A.9.2 |
| 3 | Tool scope isolation and allowlisting | PR.AA-05 · PR.IR-01 (network boundaries) | A.6.2.6 · A.8.2 |
| 4 | Rate limiting per client/tool/tenant | PR.IR-03 (resilience against DoS) · DE.CM-01 (monitoring) | A.9.2 · A.6.2.8 |
| 5 | Parameter schema validation | PR.DS-02 · PR.PS-06 (secure software development) | A.6.2.6 · A.6.2.5 |
| 6 | Output sanitization | PR.DS-02 · DE.CM-09 (hardware/software integrity) | A.6.2.6 · A.6.2.8 |
| 7 | Audit logging with tamper-evident signatures | DE.CM-01 · DE.AE-03 (event aggregation) · PR.DS-02 | A.6.2.8 · A.9.2 · A.9.3 |
| 8 | Secret rotation and HSM storage | PR.AA-01 · PR.DS-01 (data-at-rest) | A.6.2.6 · A.7.3 |
| 9 | Replay defense (nonce/timestamp) | PR.AA-05 · PR.DS-02 | A.6.2.6 |
| 10 | Network egress allowlist | PR.IR-01 · DE.CM-01 | A.6.2.7 · A.9.2 |
| 11 | Sandboxing for code execution | PR.PS-01 (configuration management) · PR.PS-05 (execution integrity) | A.6.2.6 · A.6.2.7 |
| 12 | Supply-chain verification (SBOM, signed manifests) | GV.SC-01 through GV.SC-10 (supply chain) · PR.PS-02 (software maintenance) | A.8.2 · A.8.3 · A.10.3 |
Every cell maps to an auditable artifact. Deployments implementing all 12 controls produce roughly 85% of the AI-system-operation evidence ISO 42001 Annex A.6 requires, along with the access-control, data-protection, and supply-chain evidence for NIST CSF 2.0 PROTECT and GOVERN audits.
Reference architecture {#reference-architecture}
┌────────────────────────────────┐
│ Policy Repository │
│ (signed allowlist · manifests)│
└──────────────┬─────────────────┘
│ load / refresh
▼
┌─────────────┐ mTLS + OAuth ┌──────────────────┐ mTLS ┌─────────────────┐
│ LLM Host │ ◄────────────────► │ MCP Client │◄──────────►│ MCP Server │
│ (Claude, │ │ (in-process │ │ (sandboxed, │
│ Cline, │ │ adapter) │ │ egress-fenced)│
│ etc.) │ │ │ │ │
└──────┬──────┘ └────────┬─────────┘ └────────┬────────┘
│ │ │
│ │ nonce + timestamp │ per-tool
│ ▼ ▼ STS token
│ ┌─────────────────┐ ┌──────────────┐
│ │ Audit Logger │ │ Sandbox │
│ │ (hash-chained │ │ (microVM │
│ │ WORM store) │ │ per call) │
│ └─────────────────┘ └──────────────┘
│
▼
┌─────────────────┐
│ HSM / KMS │
│ (rotation, │
│ signing) │
└─────────────────┘
The diagram captures the minimum component set a 12-control deployment requires. Note the separation of concerns: the policy repository is the single source of truth for allowlists and manifests; the audit logger is write-only from the client’s perspective and externally verifiable; the HSM/KMS is the only place long-lived keys live; sandboxes are ephemeral and per-call.
COMPEL stage mapping {#compel-mapping}
| COMPEL stage | MCP security focus |
|---|---|
| Calibrate | Inventory MCP servers in use · classify by data sensitivity and blast radius · identify unsigned or shadow installations |
| Organize | Assign an MCP owner per server · establish the MCP allowlist committee · define publisher-trust tiers |
| Model | Author capability manifests · design tool scopes · write parameter schemas · select sandbox technology |
| Produce | Deploy with all 12 controls enabled from day one · integrate into CI/CD for signed publishing · wire audit logs to SIEM |
| Evaluate | Run the MCP red-team suite · test each control with negative cases · verify compliance mapping · measure MTTD and MTTR |
| Learn | Post-incident reviews · dependency advisory sweep · manifest and allowlist revisions · rotation of signing keys |
Evidence artifacts {#evidence}
- MCP server inventory with publisher, version, SBOM link, and trust tier
- Signed allowlist document per agent role
- Capability manifest per MCP server
- Tool parameter schemas under version control
- mTLS certificate issuance policy and PKI audit log
- OAuth scope catalog (urn:mcp:tool:*)
- Rate-limit configuration file and its change history
- Audit-log Merkle roots (daily, signed) and verification run reports
- Secret rotation logs and HSM attestation reports
- Nonce cache health dashboard
- NetworkPolicy manifests for MCP namespaces
- Sandbox configuration and breakout test reports
- Cosign verification records per server launch
- Vulnerability advisory tracking per server
- Red-team reports against each control (quarterly)
Metrics {#metrics}
- MCP control coverage — percentage of registered servers enforcing all 12 controls. Target: 100%. Report: weekly.
- Mean token lifetime — P50 and P99 lifetime of OAuth tokens issued to MCP clients. Target: P99 ≤ 15 minutes.
- Denied-call rate — percentage of MCP tool calls denied by allowlist or schema validation. Baseline: 0.5–3%. Spikes warrant investigation.
- Prompt-injection success rate — percentage of red-team canary injections that change behavior. Target: < 1% and decreasing quarter-over-quarter.
- Sandbox breakout attempts — count of sandbox escape signals per month. Target: 0; any non-zero value is an incident.
- Audit-log chain verification time — seconds to verify the last 30 days of audit chain. Target: < 300 seconds; any gap is an incident.
- SBOM freshness — days since SBOM scan for each active server. Target: < 1 day for P0 servers, < 7 days for all others.
- Rotation compliance — percentage of secrets rotated within their declared window. Target: 100%.
- Egress policy violations — count of blocked egress attempts per MCP server per week. Spikes indicate a compromised or misconfigured server.
Risks if skipped {#risks}
Organizations that deploy MCP without this baseline face concrete failure modes:
- Data exfiltration via malicious or compromised servers exploiting overbroad credentials. Public 2025 incidents included unauthorized CRM and repo exports through community MCP servers.
- Prompt-injection-driven tool abuse — an agent tricked into sending emails, making payments, or deleting records because Control 6 was absent.
- Cross-tenant leakage in multi-tenant SaaS, where missing tenant scope in the server’s query layer exposes one customer to another.
- Supply-chain compromise through typosquatted or dependency-hijacked servers executing with user privileges.
- Audit failure — inability to reconstruct agent actions in an incident window because logs were incomplete, unsigned, or attacker-deleted.
- Compliance failure under ISO 42001, EU AI Act Article 15, and sector rules (HIPAA, PCI-DSS, DORA) that increasingly incorporate AI supply-chain controls.
- Loss of the right to deploy MCP — regulated customers and government buyers are beginning to require baseline attestations before allowing vendors to connect.
The cost of implementing the 12 controls is weeks of engineering. The cost of skipping them is incidents that end up in customer contracts and regulatory filings.
References {#references}
- Model Context Protocol specification — modelcontextprotocol.io/specification
- Anthropic — Introducing the Model Context Protocol — anthropic.com/news/model-context-protocol
- OWASP Top 10 for Agentic AI Applications — genai.owasp.org/llm-top-10-agentic/
- NIST Cybersecurity Framework 2.0 — nist.gov/cyberframework
- ISO/IEC 42001:2023 — AI Management Systems — iso.org/standard/81230.html
- NIST AI Risk Management Framework 1.0 — nist.gov/itl/ai-risk-management-framework
- OAuth 2.1 Authorization Framework — datatracker.ietf.org/doc/html/draft-ietf-oauth-v2-1
- RFC 8693 — OAuth 2.0 Token Exchange — datatracker.ietf.org/doc/html/rfc8693
- Sigstore / Cosign — sigstore.dev
- CycloneDX SBOM specification — cyclonedx.org
- SPIFFE / SPIRE — spiffe.io
- EU AI Act Article 15 (accuracy, robustness, cybersecurity) — eur-lex.europa.eu
Related COMPEL articles
- Tool Use and Function Calling in Autonomous AI Systems
- Enterprise Agentic AI Platform Strategy and Multi-Agent Orchestration
- OWASP Top 10 for Agentic AI: Mitigation Playbook
How to cite
COMPEL FlowRidge Team. (2026). “Model Context Protocol Security Standards: A 12-Control Hardening Baseline.” COMPEL Framework by FlowRidge. https://www.compelframework.org/articles/seo-d1-model-context-protocol-security-standards/