COMPEL Body of Knowledge — Agentic Governance Series (Cluster D) Protocol Hardening Standard
What is MCP and why it needs its own security standard {#why}
The Model Context Protocol (MCP) is Anthropic’s open protocol, published in late 2024, for connecting AI assistants to external data systems and tools. It defines a JSON-RPC interface over stdio or HTTP+SSE between an MCP host (the AI application), an MCP client (an in-host transport adapter), and an MCP server (a process that exposes tools, resources, and prompts). The protocol has rapidly become the dominant integration surface for enterprise LLM deployments: vendors ship MCP servers for CRMs, code repositories, data warehouses, and internal APIs, and hosts — Claude Desktop, Claude Code, Cline, and a growing list of third parties — consume them through a common adapter.
A single protocol concentrates risk. A compromised server can steer an LLM into exfiltrating data, executing arbitrary code, or invoking tools against systems of record. A legitimate server that returns attacker-controlled content can inject instructions into the context window. A client that does not scope credentials can hand a database admin token to a weather-lookup tool.
Traditional API security assumes a deterministic client whose logic is fixed at build time — it calls POST /transfer only when a human presses “Transfer.” MCP breaks that assumption. The client is an LLM whose behavior is shaped at inference time by the content it reads, including responses from the servers it talks to. A web page, a support ticket, or a PDF can contain text that manipulates the client into calling tools it should not.
MCP therefore requires controls traditional API security omits: protocol-level integrity so a server cannot be impersonated mid-session; tool-scope isolation so a credential leaks only one tool’s capabilities; output sanitization so tool results are treated as data, never instructions; and supply-chain verification because an MCP server is executable code the host runs with user privileges.
This article specifies a 12-control hardening baseline. Each control is paired with a reference-implementation note, mapped to NIST CSF 2.0 and ISO/IEC 42001, and placed in the COMPEL lifecycle.
MCP threat model {#threat-model}
Before specifying controls, we enumerate the threats they counter. The MCP attack surface has seven distinct classes; a baseline that does not address all seven is incomplete.
1. Malicious MCP servers
An attacker publishes an MCP server on npm, PyPI, GitHub, or a plugin marketplace, advertising useful tools. Once installed, it runs as a persistent process with the user’s privileges — reading local files, exfiltrating environment variables, harvesting OAuth tokens, and shipping deceptive tool descriptions that steer the LLM toward attacker-controlled actions.
2. Compromised legitimate MCP servers
A trusted server is compromised via dependency hijack, stolen signing key, publisher breach, or insider threat. Customers who auto-update receive the backdoored build. The SolarWinds class of attack reappears, but the payload now runs inside the LLM’s tool-call loop.
3. Tool abuse — the confused deputy
The server holds a legitimate credential and the LLM is tricked into using it for an illegitimate purpose. Classic case: an agent asked to “clean up my drafts” is manipulated by a hidden instruction in one draft into emailing the rest to an attacker. The server authenticated the call; it did the wrong thing with valid authority.
4. Token and secret leakage
Bearer tokens flow through headers and environment variables. A log line that echoes a header, a debug dump, or a tool response that accidentally includes credential material leaks secrets. Because MCP servers are long-lived, one leak persists until the token expires.
5. Cross-tenant data access
In multi-tenant SaaS, a single MCP server serves many customers. A missing WHERE tenant_id = ? clause, or a connection pool keyed by credential rather than tenant, leaks tenant A’s data to tenant B.
6. Prompt injection via tool output
The LLM calls a legitimate tool that returns third-party content — a search result, a customer email — containing instructions the LLM follows. The server is not compromised; the data is. Variants include invisible Unicode directives, steganographic payloads in images, and instructions in downstream error messages.
7. Supply-chain attacks
The server’s own dependencies are compromised. Because MCP servers run with user credentials and network access, dependency compromise equals endpoint compromise. The 2024 xz-utils incident and the 2025 wave of typosquatted mcp-* packages show the threat is already active.
The 12 controls below are calibrated so that a comprehensive deployment blocks all seven threat classes with layered defenses — no single threat relies on a single control.
The 12-control hardening checklist {#controls}
Control 1 — Mutual TLS between client and server
What it does. Both the MCP client and the MCP server present X.509 certificates and verify each other. This prevents a rogue process on the same host from impersonating a server, defends against network-level man-in-the-middle even when DNS or routing is compromised, and provides a cryptographic identity independent of user-level tokens.
Reference implementation. Issue client and server certificates from an internal PKI with short lifetimes (24–72 hours). Pin the server certificate fingerprint in client config and refuse non-matching connections. For stdio transports, enforce the equivalent via signed launch manifests and process attestation. SPIFFE/SPIRE fits large deployments.
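A minimal sketch of the client-side pinning step, using only the Python standard library. The certificate paths and the pinned fingerprint are placeholders, and stdio-transport attestation is out of scope here:

```python
import hashlib
import ssl

def verify_pin(der_cert: bytes, expected_sha256_hex: str) -> bool:
    """Compare the SHA-256 fingerprint of a DER-encoded certificate
    against the fingerprint pinned in client configuration."""
    return hashlib.sha256(der_cert).hexdigest() == expected_sha256_hex.lower()

def pinned_client_context(ca_file: str, client_cert: str, client_key: str) -> ssl.SSLContext:
    """Build a TLS context that (a) presents a client certificate for
    mutual TLS and (b) requires and verifies the server's certificate."""
    ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_file)
    ctx.load_cert_chain(certfile=client_cert, keyfile=client_key)
    ctx.verify_mode = ssl.CERT_REQUIRED
    ctx.check_hostname = True
    return ctx

# After the handshake, fetch the peer certificate in DER form and check the pin:
#   der = tls_sock.getpeercert(binary_form=True)
#   if not verify_pin(der, PINNED_FINGERPRINT): abort the connection
```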
Control 2 — Short-lived OAuth tokens scoped per tool
What it does. Every tool call is authenticated by an OAuth 2.1 access token scoped to exactly the tool being invoked and expiring within 15 minutes. Tokens are minted by a trusted authorization server on demand, with urn:mcp:tool:* scopes that describe the operation, not the API.
Reference implementation. Put an internal STS (security token service) in front of the MCP server. The client requests a token with scope=urn:mcp:tool:email:send:recipient=&lt;allowlisted&gt; and receives a JWT valid for 15 minutes. The server verifies signature and scope on every call; it never holds long-lived downstream credentials. Downstream calls use RFC 8693 token exchange for narrow per-service tokens.
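A stdlib-only sketch of the mint/verify cycle. A production STS would issue asymmetrically signed JWTs (RS256/ES256) verified against published keys; the HMAC here stands in so the signature, expiry, and exact-scope checks are visible:

```python
import base64
import hashlib
import hmac
import json
import time

def b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def mint_token(secret: bytes, scope: str, ttl_s: int = 900) -> str:
    """Mint a compact HMAC-signed token carrying a per-tool scope and a
    15-minute expiry (900 s), mimicking what the STS would issue."""
    payload = b64url(json.dumps({"scope": scope, "exp": int(time.time()) + ttl_s}).encode())
    sig = b64url(hmac.new(secret, payload.encode(), hashlib.sha256).digest())
    return f"{payload}.{sig}"

def verify_token(secret: bytes, token: str, required_scope: str) -> bool:
    """Server-side check on every call: signature, expiry, exact scope match."""
    try:
        payload_b64, sig = token.split(".")
    except ValueError:
        return False
    expected = b64url(hmac.new(secret, payload_b64.encode(), hashlib.sha256).digest())
    if not hmac.compare_digest(sig, expected):
        return False
    pad = "=" * (-len(payload_b64) % 4)
    claims = json.loads(base64.urlsafe_b64decode(payload_b64 + pad))
    return claims["exp"] > time.time() and claims["scope"] == required_scope
```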
Control 3 — Tool scope isolation and allowlisting
What it does. Each MCP server declares a capability manifest listing the tools it exposes and the scopes each tool requires. The host enforces an organization-wide allowlist that pins which servers and tools a given agent role may invoke. Requests for tools outside the allowlist are denied at the orchestrator layer before any network call is issued.
Reference implementation. Store the allowlist as a versioned YAML document signed by the governance team’s key. The host loads it at startup and on policy-system webhook. Agents under a given role see only their role’s tool union; disallowed tools are redacted from the LLM’s view entirely — not merely denied at call time — removing them from the attack surface.
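The load-then-redact flow might look like the following sketch. The HMAC stands in for the governance team's asymmetric signature, and the role and tool names are hypothetical:

```python
import hashlib
import hmac
import json

def load_policy(raw: bytes, signature_hex: str, signing_key: bytes) -> dict:
    """Refuse to load an allowlist whose signature does not verify.
    (Production deployments would verify an asymmetric signature instead.)"""
    mac = hmac.new(signing_key, raw, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(mac, signature_hex):
        raise ValueError("allowlist signature verification failed")
    return json.loads(raw)

def visible_tools(policy: dict, role: str, advertised: list) -> list:
    """Redact disallowed tools entirely: tools outside the role's
    allowlist never appear in the LLM's view."""
    allowed = set(policy.get("roles", {}).get(role, []))
    return [t for t in advertised if t in allowed]
```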
Control 4 — Rate limiting per client, per tool, per tenant
What it does. Limits the blast radius of any single compromise. A leaked token cannot be used to enumerate an entire dataset; a looping agent cannot drive cost explosions; a prompt-injected exfiltration cannot run faster than the allowed per-tool budget.
Reference implementation. Three-dimensional limits — client × tool × tenant — enforced at the server with a Redis-backed token bucket. Defaults are conservative (10 calls/minute per client per tool) with per-deployment tuning. Violations trigger a soft alert at first breach and a hard lockout after three breaches in five minutes. Bucket keys include tenant ID to prevent cross-tenant exhaustion.
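An in-memory sketch of the three-dimensional bucket; a production version would back the buckets with Redis so all server replicas share state, as noted above:

```python
import time
from collections import defaultdict

class ThreeDimRateLimiter:
    """Token bucket keyed by (client, tool, tenant), so no single
    dimension can exhaust another tenant's budget."""

    def __init__(self, rate_per_min: int = 10, burst: int = 10):
        self.rate = rate_per_min / 60.0              # tokens refilled per second
        self.burst = burst                           # bucket capacity
        # Each bucket starts full, stamped with its creation time.
        self.buckets = defaultdict(lambda: (burst, time.monotonic()))

    def allow(self, client: str, tool: str, tenant: str) -> bool:
        key = (client, tool, tenant)
        tokens, last = self.buckets[key]
        now = time.monotonic()
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens < 1:
            self.buckets[key] = (tokens, now)
            return False                             # over budget: deny
        self.buckets[key] = (tokens - 1, now)
        return True
```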
Control 5 — Parameter schema validation (server-side)
What it does. Every MCP tool declares a JSON Schema for its parameters. The server validates every incoming call against the schema and rejects any deviation — extra fields, type mismatches, out-of-range values, unknown enums. This closes the most common confused-deputy attack path, where an LLM is steered into setting a parameter the tool was never designed to accept.
Reference implementation. Publish schemas alongside the tool manifest. The MCP server uses a battle-tested validator (Ajv for Node, jsonschema for Python) in strict mode with additionalProperties: false. Constraints must include format (regex, email, URI), numeric bounds, string length, and enum membership. Nullable fields must be explicit. Schema failures return a typed error that the orchestrator logs without echoing the offending payload back to the LLM.
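The strict-mode rules can be illustrated with a deliberately minimal validator. Production servers should use Ajv or jsonschema as noted above, since full JSON Schema covers far more (formats, regex patterns, nested objects); this sketch only shows the rejection behavior:

```python
def validate_params(schema: dict, params: dict) -> list:
    """Minimal strict validator: reject unknown fields, missing required
    fields, type mismatches, enum violations, and out-of-range values.
    Returns a list of error strings; empty means the call is accepted."""
    errors = []
    props = schema.get("properties", {})
    for name in params:
        if name not in props:                        # additionalProperties: false
            errors.append(f"unknown field: {name}")
    for name in schema.get("required", []):
        if name not in params:
            errors.append(f"missing required field: {name}")
    type_map = {"string": str, "integer": int, "number": (int, float), "boolean": bool}
    for name, rule in props.items():
        if name not in params:
            continue
        value = params[name]
        expected = type_map.get(rule.get("type"))
        if expected and not isinstance(value, expected):
            errors.append(f"type mismatch: {name}")
            continue
        if "enum" in rule and value not in rule["enum"]:
            errors.append(f"enum violation: {name}")
        if "maximum" in rule and value > rule["maximum"]:
            errors.append(f"above maximum: {name}")
    return errors
```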
Control 6 — Output sanitization (mark tool output as untrusted)
What it does. Ensures the LLM cannot mistake data returned by a tool for instructions from the user or system. The server wraps every response in an untrusted-content envelope; the host prompt template explicitly tells the LLM that content inside the envelope is data to reason about, not commands to execute.
Reference implementation. Two layers. At the server, escape or strip patterns resembling prompt directives (<|im_start|>, ###, “ignore previous”, invisible Unicode tag characters U+E0000–U+E007F, BOM-smuggling). At the host, wrap sanitized responses in a structured container (e.g. <mcp_result tool="..." provenance="untrusted">...</mcp_result>) with a system instruction to treat such containers as evidence, not directives. Deploy canary instructions in safe test servers and alert if the LLM follows them.
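A sketch of the server-side stripping layer and the host envelope. The directive patterns are an illustrative subset, not a complete denylist, and deployments should tune them:

```python
import re

# Patterns that commonly smuggle instructions into tool output (illustrative).
_DIRECTIVE = re.compile(r"<\|im_start\|>|ignore (all )?previous", re.IGNORECASE)
# Unicode tag characters (U+E0000..U+E007F) and BOM, used for invisible payloads.
_INVISIBLE = re.compile(r"[\U000E0000-\U000E007F\ufeff]")

def sanitize(text: str) -> str:
    """Strip invisible Unicode payloads and neutralize directive-like text."""
    text = _INVISIBLE.sub("", text)
    return _DIRECTIVE.sub("[redacted-directive]", text)

def envelope(tool: str, text: str) -> str:
    """Wrap sanitized output in the untrusted-content container that the
    host's system prompt tells the LLM to treat as data, never commands."""
    return f'<mcp_result tool="{tool}" provenance="untrusted">{sanitize(text)}</mcp_result>'
```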
Control 7 — Audit logging with tamper-evident signatures
What it does. Creates a forensic record that survives insider threats and post-incident tampering. Every MCP message — connection establishment, capability advertisement, tool call, tool result, error — is logged with a cryptographic hash chained to the previous entry, so that any modification or deletion of a single log line invalidates the chain from that point forward.
Reference implementation. Hash-chained append-only log in an S3 Object Lock bucket or equivalent WORM store. Each entry records timestamp, tenant, user, agent run ID, server ID, tool, parameter hash, result hash (not contents), latency, and an HMAC over the previous entry’s hash. A daily Merkle root is signed by an HSM key and published to an internal transparency log. On-call engineers verify the chain end-to-end in under five minutes with the shipped verifier.
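The chaining and verification logic reduces to a few lines. WORM storage, Merkle roots, and HSM signing are out of scope for this sketch, which only shows why tampering with any entry breaks the chain:

```python
import hashlib
import hmac
import json

class HashChainedLog:
    """Append-only log where each entry carries an HMAC over the previous
    entry's hash; editing or deleting any entry invalidates verification
    from that point forward."""

    GENESIS = "0" * 64

    def __init__(self, key: bytes):
        self.key = key
        self.entries = []
        self.prev_hash = self.GENESIS

    def append(self, record: dict) -> None:
        body = json.dumps(record, sort_keys=True)
        link = hmac.new(self.key, self.prev_hash.encode(), hashlib.sha256).hexdigest()
        self.entries.append({"body": body, "prev": self.prev_hash, "link": link})
        self.prev_hash = hashlib.sha256((body + link).encode()).hexdigest()

    def verify(self) -> bool:
        prev = self.GENESIS
        for e in self.entries:
            if e["prev"] != prev:
                return False                          # chain break: entry altered
            link = hmac.new(self.key, prev.encode(), hashlib.sha256).hexdigest()
            if not hmac.compare_digest(link, e["link"]):
                return False                          # forged link
            prev = hashlib.sha256((e["body"] + link).encode()).hexdigest()
        return True
```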
Control 8 — Secret rotation and HSM-backed key storage
What it does. Limits the window of usefulness for any stolen secret and ensures long-lived cryptographic identities (signing keys, root certificates, OAuth client secrets) are never exposed in process memory or configuration files.
Reference implementation. Automated 24-hour rotation for OAuth client secrets and database passwords; 90-day TLS certificates; annual signing-key rotation. Secrets are brokered via a manager (Vault, AWS Secrets Manager, Azure Key Vault) with instance-profile auth — no static credentials on disk. Root signing keys live in an HSM or cloud KMS; servers call the KMS to sign rather than holding key material. Rotation failures page on-call immediately.
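The compliance check that feeds the paging path can be as simple as comparing each secret's age against its declared window. Field names below are hypothetical:

```python
import time

def rotation_violations(secrets: list, now: float = None) -> list:
    """Return names of secrets older than their declared rotation window.
    A non-empty result feeds the 'rotation compliance' metric and pages
    on-call immediately, per the reference implementation above."""
    if now is None:
        now = time.time()
    return [s["name"] for s in secrets if now - s["rotated_at"] > s["window_s"]]
```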
Control 9 — Replay attack defense (nonce and timestamp)
What it does. Prevents an attacker who captures a valid MCP request from replaying it later. Every request includes a nonce and a timestamp; the server rejects requests outside a five-minute window or reusing a recent nonce.
Reference implementation. Each client-to-server request carries X-MCP-Nonce (128-bit random) and X-MCP-Timestamp (Unix epoch, milliseconds) headers, both covered by the HMAC signature. The server maintains a sliding-window nonce cache (Redis with TTL=600s) and rejects replays with HTTP 409. For stdio transports the equivalent is a per-session sequence number enforced by the client launcher. Clock skew beyond ±90 seconds triggers a warning; beyond ±300 seconds triggers rejection.
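An in-memory sketch of the two checks; the Redis-backed cache described above replaces the dict in production so all replicas share nonce state:

```python
import time

class ReplayGuard:
    """Reject requests with stale timestamps or reused nonces."""

    def __init__(self, window_s: int = 300, ttl_s: int = 600):
        self.window_s = window_s       # accept window for timestamps
        self.ttl_s = ttl_s             # how long a nonce stays cached
        self.seen = {}                 # nonce -> first-seen time

    def check(self, nonce: str, timestamp_ms: int, now: float = None) -> bool:
        if now is None:
            now = time.time()
        # Evict expired nonces (Redis does this via TTL).
        self.seen = {n: t for n, t in self.seen.items() if now - t < self.ttl_s}
        if abs(now - timestamp_ms / 1000.0) > self.window_s:
            return False               # outside the five-minute window
        if nonce in self.seen:
            return False               # replayed nonce
        self.seen[nonce] = now
        return True
```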
Control 10 — Network egress allowlist for MCP servers
What it does. Blocks data exfiltration even if an MCP server is compromised. The server runs in a network namespace whose outbound connections are restricted to a named allowlist of domains and ports required by the tools it exposes.
Reference implementation. Deploy MCP servers into a dedicated Kubernetes namespace with a default-deny NetworkPolicy. Explicit egress rules list only the downstream APIs each tool needs (for example, api.salesforce.com:443 for a CRM tool). DNS resolution goes through an internal resolver that blocks lookups to non-allowlisted domains. Egress violations trigger a high-severity alert and automatic server quarantine. For stdio servers running on developer workstations, use nft/pf rules or eBPF-based enforcement.
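A NetworkPolicy for a hypothetical CRM server might look like the following. NetworkPolicy operates on IPs and ports, so the domain-level allowlist is enforced at the internal resolver as described above; namespace names, labels, and selectors here are assumptions for illustration:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: mcp-crm-egress
  namespace: mcp-servers          # dedicated namespace, default-deny baseline
spec:
  podSelector:
    matchLabels:
      app: mcp-crm                # hypothetical CRM MCP server
  policyTypes:
    - Egress
  egress:
    - to:                         # DNS only via the internal resolver
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
      ports:
        - protocol: UDP
          port: 53
    - ports:                      # HTTPS to the downstream CRM API
        - protocol: TCP
          port: 443
```

Because the internal resolver refuses lookups for non-allowlisted domains, a compromised server cannot resolve an exfiltration endpoint even though port 443 egress is open.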
Control 11 — Sandboxing for code-execution tools (per-call microVM)
What it does. Any MCP tool that executes code (REPL, shell, script runner, code formatter, build tool) runs each invocation in a fresh microVM with no persistent state, no network egress by default, and a strict resource budget. This contains both malicious code supplied through prompt injection and misbehaving tool implementations.
Reference implementation. Firecracker, gVisor, or Kata Containers backing a per-call sandbox with 256 MB memory, 60-second wall-clock limit, read-only root filesystem, scratch /tmp of 64 MB, and no network. Outputs are captured and returned via the MCP response; the sandbox is destroyed unconditionally at end of call. Code-execution tools must declare the egress domains they need (rare) and the orchestrator must approve them under Control 10. Even a successful sandbox breakout yields little, because the underlying host exposes no network access and holds no persistent secrets.
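Where a microVM is unavailable (for example, a developer workstation), a POSIX resource-limited subprocess approximates the per-call budget. This is a budget, not an isolation boundary; it does not replace Firecracker or gVisor. The sketch assumes a Unix host and uses a wider memory limit than the microVM's 256 MB to leave headroom for the interpreter:

```python
import resource
import subprocess
import sys
import tempfile

def run_sandboxed(code: str, timeout_s: int = 60, mem_bytes: int = 512 * 2**20) -> str:
    """Run untrusted Python code in a child process with CPU, memory, and
    file-size budgets, a fresh scratch directory per call, and isolated mode."""

    def limits():
        resource.setrlimit(resource.RLIMIT_AS, (mem_bytes, mem_bytes))
        resource.setrlimit(resource.RLIMIT_CPU, (timeout_s, timeout_s))
        resource.setrlimit(resource.RLIMIT_FSIZE, (64 * 2**20, 64 * 2**20))

    with tempfile.TemporaryDirectory() as scratch:   # destroyed after the call
        proc = subprocess.run(
            [sys.executable, "-I", "-c", code],      # -I: isolated interpreter
            capture_output=True, text=True,
            timeout=timeout_s, cwd=scratch, preexec_fn=limits,
        )
    return proc.stdout
```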
Control 12 — Supply-chain verification (signed manifests and SBOM)
What it does. Ensures the MCP server code that is running is the code that was authorized to run. Every server ships a Software Bill of Materials (SBOM) and a signed manifest; the host verifies both at install time and at every launch.
Reference implementation. Servers are published as OCI artifacts with Sigstore-signed manifests and CycloneDX SBOMs. The host runs cosign verify against a pinned set of trusted signers (the governance team’s identity for internal servers; a small allowlist of vendors for third-party servers). SBOMs are scanned against a vulnerability feed (OSV, GitHub Advisory) daily; findings of severity ≥ High trigger an advisory on the internal MCP catalog and a 72-hour remediation deadline. Unsigned servers cannot be installed; manifests that fail verification cannot launch.
Mapping each control to NIST CSF 2.0 and ISO 42001 {#compliance-mapping}
| # | Control | NIST CSF 2.0 | ISO/IEC 42001 Annex A |
|---|---|---|---|
| 1 | Mutual TLS | PR.AA-05 (authentication mechanisms) · PR.DS-02 (data-in-transit) | A.6.2.6 · A.6.2.8 |
| 2 | Short-lived OAuth scoped per tool | PR.AA-01 (identities) · PR.AA-03 (users authenticated) | A.6.2.6 · A.9.2 |
| 3 | Tool scope isolation and allowlisting | PR.AA-05 · PR.IR-01 (network boundaries) | A.6.2.6 · A.8.2 |
| 4 | Rate limiting per client/tool/tenant | PR.IR-03 (resilience against DoS) · DE.CM-01 (monitoring) | A.9.2 · A.6.2.8 |
| 5 | Parameter schema validation | PR.DS-02 · PR.PS-06 (secure software development) | A.6.2.6 · A.6.2.5 |
| 6 | Output sanitization | PR.DS-02 · DE.CM-09 (hardware/software integrity) | A.6.2.6 · A.6.2.8 |
| 7 | Audit logging with tamper-evident signatures | DE.CM-01 · DE.AE-03 (event aggregation) · PR.DS-02 | A.6.2.8 · A.9.2 · A.9.3 |
| 8 | Secret rotation and HSM storage | PR.AA-01 · PR.DS-01 (data-at-rest) | A.6.2.6 · A.7.3 |
| 9 | Replay defense (nonce/timestamp) | PR.AA-05 · PR.DS-02 | A.6.2.6 |
| 10 | Network egress allowlist | PR.IR-01 · DE.CM-01 | A.6.2.7 · A.9.2 |
| 11 | Sandboxing for code execution | PR.PS-01 (configuration management) · PR.PS-05 (execution integrity) | A.6.2.6 · A.6.2.7 |
| 12 | Supply-chain verification (SBOM, signed manifests) | GV.SC-01 through GV.SC-10 (supply chain) · PR.PS-02 (software maintenance) | A.8.2 · A.8.3 · A.10.3 |
Every cell maps to an auditable artifact. Deployments implementing all 12 controls produce roughly 85% of the AI-system-operation evidence ISO 42001 Annex A.6 requires, along with the access-control, data-protection, and supply-chain evidence for NIST CSF 2.0 PROTECT and GOVERN audits.
Reference architecture {#reference-architecture}
┌────────────────────────────────┐
│ Policy Repository │
│ (signed allowlist · manifests)│
└──────────────┬─────────────────┘
│ load / refresh
▼
┌─────────────┐ mTLS + OAuth ┌──────────────────┐ mTLS ┌─────────────────┐
│ LLM Host │ ◄────────────────► │ MCP Client │◄──────────►│ MCP Server │
│ (Claude, │ │ (in-process │ │ (sandboxed, │
│ Cline, │ │ adapter) │ │ egress-fenced)│
│ etc.) │ │ │ │ │
└──────┬──────┘ └────────┬─────────┘ └────────┬────────┘
│ │ │
│ │ nonce + timestamp │ per-tool
│ ▼ ▼ STS token
│ ┌─────────────────┐ ┌──────────────┐
│ │ Audit Logger │ │ Sandbox │
│ │ (hash-chained │ │ (microVM │
│ │ WORM store) │ │ per call) │
│ └─────────────────┘ └──────────────┘
│
▼
┌─────────────────┐
│ HSM / KMS │
│ (rotation, │
│ signing) │
└─────────────────┘
The diagram captures the minimum component set a 12-control deployment requires. Note the separation of concerns: the policy repository is the single source of truth for allowlists and manifests; the audit logger is write-only from the client’s perspective and externally verifiable; the HSM/KMS is the only place long-lived keys live; sandboxes are ephemeral and per-call.
COMPEL stage mapping {#compel-mapping}
| COMPEL stage | MCP security focus |
|---|---|
| Calibrate | Inventory MCP servers in use · classify by data sensitivity and blast radius · identify unsigned or shadow installations |
| Organize | Assign an MCP owner per server · establish the MCP allowlist committee · define publisher-trust tiers |
| Model | Author capability manifests · design tool scopes · write parameter schemas · select sandbox technology |
| Produce | Deploy with all 12 controls enabled from day one · integrate into CI/CD for signed publishing · wire audit logs to SIEM |
| Evaluate | Run the MCP red-team suite · test each control with negative cases · verify compliance mapping · measure MTTD and MTTR |
| Learn | Post-incident reviews · dependency advisory sweep · manifest and allowlist revisions · rotation of signing keys |
Evidence artifacts {#evidence}
- MCP server inventory with publisher, version, SBOM link, and trust tier
- Signed allowlist document per agent role
- Capability manifest per MCP server
- Tool parameter schemas under version control
- mTLS certificate issuance policy and PKI audit log
- OAuth scope catalog (urn:mcp:tool:*)
- Rate-limit configuration file and its change history
- Audit-log Merkle roots (daily, signed) and verification run reports
- Secret rotation logs and HSM attestation reports
- Nonce cache health dashboard
- NetworkPolicy manifests for MCP namespaces
- Sandbox configuration and breakout test reports
- Cosign verification records per server launch
- Vulnerability advisory tracking per server
- Red-team reports against each control (quarterly)
Metrics {#metrics}
- MCP control coverage — percentage of registered servers enforcing all 12 controls. Target: 100%. Report: weekly.
- Mean token lifetime — P50 and P99 lifetime of OAuth tokens issued to MCP clients. Target: P99 ≤ 15 minutes.
- Denied-call rate — percentage of MCP tool calls denied by allowlist or schema validation. Baseline: 0.5–3%. Spikes warrant investigation.
- Prompt-injection success rate — percentage of red-team canary injections that change behavior. Target: < 1% and decreasing quarter-over-quarter.
- Sandbox breakout attempts — count of sandbox escape signals per month. Target: 0; any non-zero value is an incident.
- Audit-log chain verification time — seconds to verify the last 30 days of audit chain. Target: < 300 seconds; any gap is an incident.
- SBOM freshness — days since SBOM scan for each active server. Target: < 1 day for P0 servers, < 7 days for all others.
- Rotation compliance — percentage of secrets rotated within their declared window. Target: 100%.
- Egress policy violations — count of blocked egress attempts per MCP server per week. Spikes indicate a compromised or misconfigured server.
Risks if skipped {#risks}
Organizations that deploy MCP without this baseline face concrete failure modes:
- Data exfiltration via malicious or compromised servers exploiting overbroad credentials. Public 2025 incidents included unauthorized CRM and repo exports through community MCP servers.
- Prompt-injection-driven tool abuse — an agent tricked into sending emails, making payments, or deleting records because Control 6 was absent.
- Cross-tenant leakage in multi-tenant SaaS, where missing tenant scope in the server’s query layer exposes one customer to another.
- Supply-chain compromise through typosquatted or dependency-hijacked servers executing with user privileges.
- Audit failure — inability to reconstruct agent actions in an incident window because logs were incomplete, unsigned, or attacker-deleted.
- Compliance failure under ISO 42001, EU AI Act Article 15, and sector rules (HIPAA, PCI-DSS, DORA) that increasingly incorporate AI supply-chain controls.
- Loss of the right to deploy MCP — regulated customers and government buyers are beginning to require baseline attestations before allowing vendors to connect.
The cost of implementing the 12 controls is weeks of engineering. The cost of skipping them is incidents that end up in customer contracts and regulatory filings.
References {#references}
- Model Context Protocol specification — modelcontextprotocol.io/specification
- Anthropic — Introducing the Model Context Protocol — anthropic.com/news/model-context-protocol
- OWASP Top 10 for Agentic AI Applications — genai.owasp.org/llm-top-10-agentic/
- NIST Cybersecurity Framework 2.0 — nist.gov/cyberframework
- ISO/IEC 42001:2023 — AI Management Systems — iso.org/standard/81230.html
- NIST AI Risk Management Framework 1.0 — nist.gov/itl/ai-risk-management-framework
- OAuth 2.1 Authorization Framework — datatracker.ietf.org/doc/html/draft-ietf-oauth-v2-1
- RFC 8693 — OAuth 2.0 Token Exchange — datatracker.ietf.org/doc/html/rfc8693
- Sigstore / Cosign — sigstore.dev
- CycloneDX SBOM specification — cyclonedx.org
- SPIFFE / SPIRE — spiffe.io
- EU AI Act Article 15 (accuracy, robustness, cybersecurity) — eur-lex.europa.eu
Related COMPEL articles
- Tool Use and Function Calling in Autonomous AI Systems
- Enterprise Agentic AI Platform Strategy and Multi-Agent Orchestration
- OWASP Top 10 for Agentic AI: Mitigation Playbook
How to cite
COMPEL FlowRidge Team. (2026). “Model Context Protocol Security Standards: A 12-Control Hardening Baseline.” COMPEL Framework by FlowRidge. https://www.compelframework.org/articles/seo-d1-model-context-protocol-security-standards/