AITM M1.6-Art08 v1.0 Reviewed 2026-04-06 Open Access
M1.6 People, Change, and Organizational Readiness
AITF · Foundations

Multi-Agent Systems and A2A Protocols



COMPEL Specialization — AITM-AAG: Agentic AI Governance Associate Article 8 of 14


Definition. A multi-agent system (MAS) is a system composed of multiple interacting agents, each with its own goals, memory, tool access, and identity. An agent-to-agent (A2A) protocol is the communication mechanism by which agents exchange messages — it includes the message format, the authentication of sender and receiver, the authorisation of the exchange, rate limits, and the audit record. MAS governance extends single-agent governance but does not replace it: each agent in an MAS retains its own classification, authority chain, oversight regime, tool-use controls, and memory policy, and also inherits the additional controls that the interaction between agents requires.

The multi-agent field has a long academic history (from Russell & Norvig’s treatment of agent architectures to Park et al.’s UIST 2023 study of generative agents that displayed emergent behaviour). Source: https://arxiv.org/abs/2304.03442. The governance question is newer: how do we run an MAS in production such that the incident, the audit, and the regulator can each be served by legible records?

When to use an MAS — and when not to

Multi-agent is not a superior architecture. It is a more expensive one. An MAS is appropriate when:

  • The workflow has genuinely parallel sub-tasks that benefit from specialisation (e.g., a researcher agent, an analyst agent, and an editor agent).
  • Different sub-tasks require different tool surfaces that should not be unified in a single identity.
  • Different sub-tasks have different oversight regimes.

An MAS is inappropriate — and a single-agent-with-tools is preferable — when:

  • The workflow is essentially sequential and one agent could carry all the context.
  • The multi-agent design adds only notional specialisation without clear role boundaries.
  • The cost of the extra governance burden exceeds the value added.

The governance analyst should ask the engineering team, at design review, to justify the MAS against single-agent alternatives. If no justification is offered, the default is to revert to a single agent.

The four canonical topologies

An MAS topology describes how the agents relate to one another. Four recurrent patterns appear in enterprise deployments.

Hierarchical

A manager agent delegates sub-tasks to worker agents and aggregates their output. Authority flows one way — from manager to workers — and the manager has kill-switch authority over workers. Common implementations: LangGraph’s supervisor pattern, AutoGen’s group-chat with an orchestrator, CrewAI’s crew with a manager process, OpenAI Agents SDK handoffs.

Governance characteristic: clear accountability — the manager agent’s authority chain covers the MAS.

Failure mode: manager capture — a compromised manager or a manager with a bad plan yields bad outputs from all workers.

Peer

Agents communicate laterally, with no hierarchical authority. A drafter and a critic may iterate; a researcher and a planner may hand off. Common implementations: AutoGen’s two-agent patterns, hand-coded LangGraph flows with edges between peers.

Governance characteristic: authority is distributed; each agent carries its own chain.

Failure mode: deadlock (agents waiting on each other), feedback loops (drafter-critic pairs that never converge), and the absence of a clear “who-is-in-charge” when things go wrong.
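The deadlock and non-convergence failure modes can be bounded at design time. The sketch below shows one way to guard a drafter-critic peer loop: a hard iteration cap plus an explicit acceptance signal, so a pair that never converges terminates with an escalation rather than looping indefinitely. The draft and critique callables are hypothetical stand-ins for real agent invocations, not any framework's API.

```python
# Convergence guard for a drafter-critic peer loop (illustrative sketch).
# draft_fn(feedback) produces or revises a draft; critique_fn(draft) returns
# None to accept, or a critique string to request revision.
def drafter_critic_loop(draft_fn, critique_fn, max_rounds: int = 5):
    draft = draft_fn(None)                # initial draft, no feedback yet
    for round_no in range(max_rounds):
        critique = critique_fn(draft)
        if critique is None:              # critic accepts the draft
            return draft, round_no + 1
        draft = draft_fn(critique)        # revise against the critique
    # No convergence within budget: escalate rather than keep looping.
    raise RuntimeError(f"no convergence after {max_rounds} rounds")
```

The cap is a governance control, not an optimisation: it guarantees the "who-is-in-charge" question is answered by escalation when the peers cannot resolve it themselves.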

Marketplace

Agents offer services and bid for tasks, or a task is brokered to whichever agent claims it. The pattern appears in research prototypes and some enterprise experiments; it is less common in production.

Governance characteristic: requires a broker agent or registry with strong authorisation to prevent an adversarial agent claiming work it should not have.

Failure mode: adverse selection (unsuitable agents taking on tasks), and the authority-chain fragmentation that marketplace brokerage introduces.

Blackboard

Agents read from and write to a shared workspace; no direct agent-to-agent messages occur. Early multi-agent research used the blackboard pattern; modern incarnations include shared vector stores, shared scratchpads, or shared memory regions.

Governance characteristic: the blackboard is itself a governed asset with access control and audit.

Failure mode: blackboard-as-attack-surface — one compromised agent can plant content that all other agents consume (a multi-agent memory-poisoning pattern; see Article 7).
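Treating the blackboard as a governed asset can be made concrete with per-agent access control and write attribution, so a poisoned entry is always traceable to the agent that planted it. The sketch below is a minimal in-memory illustration; the agent names and permission sets are assumptions for the example.

```python
# Governed-blackboard sketch: per-agent read/write ACLs plus write attribution.
class Blackboard:
    def __init__(self, acl):
        self.acl = acl          # agent -> subset of {"read", "write"}
        self.entries = []       # (author, content) pairs: every write attributed

    def write(self, agent: str, content: str):
        if "write" not in self.acl.get(agent, set()):
            raise PermissionError(f"{agent} may not write to the blackboard")
        self.entries.append((agent, content))

    def read(self, agent: str):
        if "read" not in self.acl.get(agent, set()):
            raise PermissionError(f"{agent} may not read the blackboard")
        return list(self.entries)
```

Attribution does not prevent poisoning, but it makes the post-incident question — which agent planted the content all the others consumed — answerable from the record.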

A2A protocol elements

A protocol between agents is not a casual matter. It carries the same security, authorisation, and audit concerns that any inter-service communication carries, plus the added complication that the message content may contain instructions that the receiver, if it is an LLM-based agent, may try to follow. Six elements make the protocol production-ready.

Element 1 — Message format

A stable, schema-defined format. Messages have a sender identity, a receiver identity, a message type, a payload, a correlation identifier, and a timestamp. The payload schema varies by message type but is validated at the receiver before the content reaches the agent’s reasoning.
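A minimal envelope with those fields can be sketched as follows. The field names and allowed message types are illustrative assumptions, not taken from any named protocol; the point is that envelope validation happens before the payload reaches the agent's reasoning.

```python
# Illustrative A2A message envelope with receiver-side validation.
from dataclasses import dataclass, field
import time
import uuid

ALLOWED_TYPES = {"task.request", "task.result", "status.update"}

@dataclass(frozen=True)
class A2AMessage:
    sender: str        # verified agent identity, e.g. "agent://researcher"
    receiver: str      # target agent identity
    msg_type: str      # one of ALLOWED_TYPES
    payload: dict      # schema varies by msg_type; validated before reasoning
    correlation_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    timestamp: float = field(default_factory=time.time)

def validate_envelope(msg: A2AMessage) -> list[str]:
    """Return envelope-level problems; an empty list means acceptable."""
    problems = []
    if msg.msg_type not in ALLOWED_TYPES:
        problems.append(f"unknown message type: {msg.msg_type}")
    if not msg.sender or not msg.receiver:
        problems.append("sender and receiver are mandatory")
    if not isinstance(msg.payload, dict):
        problems.append("payload must be a structured object")
    return problems
```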

Anthropic’s Model Context Protocol (MCP), publicly released in November 2024, is a named open protocol in the agent-to-tool communication space. Source: https://www.anthropic.com/news/model-context-protocol. MCP is one implementation; OpenAI’s Agents SDK uses its own handoff format; Google’s Agent Development Kit uses its own; hand-built systems on Llama or Mistral plus LangGraph or CrewAI define their own. The governance analyst treats the choice of protocol as an engineering decision; the compliance question is whether the chosen protocol has the six elements described here.

Element 2 — Authentication

Sender identity is verifiable. A message claiming to come from Agent A was actually emitted by Agent A. Authentication is typically a signed token, a service-level credential, or a mutual-TLS identity.

Unauthenticated A2A is an invitation to spoofing: a process on the network, not part of the MAS, can masquerade as a legitimate agent and cause misbehaviour.
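One lightweight form of sender authentication is an HMAC signature over the canonicalised message body. The sketch below assumes pre-shared per-agent secrets and JSON canonicalisation for simplicity; a production MAS would more likely use mutual TLS or signed service tokens, as the text notes.

```python
# HMAC-based sender authentication sketch (shared secrets are an assumption).
import hmac
import hashlib
import json

AGENT_KEYS = {"agent://researcher": b"researcher-secret"}  # per-agent secret

def _canonical(body: dict) -> bytes:
    # Deterministic serialisation so sender and receiver sign the same bytes.
    return json.dumps(body, sort_keys=True).encode()

def sign(sender: str, body: dict) -> str:
    return hmac.new(AGENT_KEYS[sender], _canonical(body), hashlib.sha256).hexdigest()

def verify(sender: str, body: dict, signature: str) -> bool:
    key = AGENT_KEYS.get(sender)
    if key is None:
        return False  # unknown sender: reject outright
    expected = hmac.new(key, _canonical(body), hashlib.sha256).hexdigest()
    # Constant-time comparison to avoid timing side channels.
    return hmac.compare_digest(expected, signature)
```

A message whose body has been altered in transit, or whose claimed sender is not in the key registry, fails verification before any content reaches the receiving agent.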

Element 3 — Authorisation

The sender is authorised to request the receiver’s action. Authorisation is per message type, per sender, per receiver. A researcher agent may be authorised to request a retrieval from a retrieval agent but not to request a write from a ticketing agent.

The authorisation rules are configuration, not prompt. The receiver checks rules before acting; the receiver does not rely on the sender’s prompt assertion of authorisation.
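Authorisation-as-configuration can be as simple as an allow-list keyed by (sender, receiver) and holding the permitted message types, checked by the receiver before acting. The entries below are illustrative.

```python
# Authorisation matrix sketch: per message type, per sender, per receiver.
# The receiver consults this configuration; it never trusts the sender's
# prompt-level assertion of authority.
AUTHZ = {
    ("agent://researcher", "agent://retriever"): {"retrieval.request"},
    ("agent://analyst", "agent://ticketing"): {"ticket.read"},  # read, never write
}

def is_authorised(sender: str, receiver: str, msg_type: str) -> bool:
    """Deny by default: unknown pairs and unlisted message types are refused."""
    return msg_type in AUTHZ.get((sender, receiver), set())
```

The deny-by-default shape matters: the researcher agent in the article's example can request a retrieval, but a write request to the ticketing agent simply has no matching entry.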

Element 4 — Rate limits

Per sender-receiver pair, per message type, per time window. A compromised sender that is rate-limited cannot flood the receiver; a receiver that is rate-limited cannot be used for a denial-of-service attack against downstream systems.

Element 5 — Payload sanitisation

The payload is treated as potentially hostile. Instruction patterns are detected. Content-type validation is strict. Sensitive-data rules apply.
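Instruction-pattern detection can be illustrated with a small rule set applied to inbound payload text before it reaches an LLM-based receiver. The patterns below are a deliberately tiny sample for illustration; a real deployment would use a maintained detection layer, not three regexes.

```python
# Payload sanitisation sketch: flag instruction-like content in a payload.
import re

SUSPICIOUS = [
    re.compile(r"ignore (all )?(previous|prior) instructions", re.I),
    re.compile(r"you are now", re.I),
    re.compile(r"system prompt", re.I),
]

def flag_payload(text: str) -> list[str]:
    """Return the patterns that matched; an empty list means no flags raised."""
    return [p.pattern for p in SUSPICIOUS if p.search(text)]
```

A flagged payload need not be silently dropped; routing it to quarantine with its correlation identifier preserves the audit trail.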

Element 6 — Audit

Every message is logged — sender, receiver, type, payload summary, timestamp, correlation identifier, result. The log is append-only from the agents’ perspective and feeds the MAS’s incident response.
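A minimal in-memory sketch of that audit record, with the correlation identifier as the key for reconstructing a session's message trail, might look as follows. In production the store would be write-once infrastructure the agents cannot edit; the in-memory list here is an illustrative stand-in.

```python
# Append-only audit log sketch: agents may append, never update or delete.
import time

class AuditLog:
    def __init__(self):
        self._records = []   # production: write-once storage outside agent reach

    def append(self, sender, receiver, msg_type, payload_summary,
               correlation_id, result):
        self._records.append({
            "sender": sender, "receiver": receiver, "type": msg_type,
            "payload_summary": payload_summary,
            "correlation_id": correlation_id, "result": result,
            "timestamp": time.time(),
        })

    def by_correlation(self, correlation_id):
        """Reconstruct one session's message trail for incident response."""
        return [r for r in self._records if r["correlation_id"] == correlation_id]
```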

Emergent behaviour — the MAS-specific risk class

Two agents composed correctly can produce behaviour neither was designed for. The phenomenon is not hypothetical. Park et al.’s 2023 study documented emergent coordination in a simulated village of generative agents. In enterprise terms, two classes of emergent behaviour warrant attention.

Collusion

Agents discover that cooperating on a goal neither was intended to pursue yields outcomes both their reward signals reinforce. Collusion is a governance failure mode even when the goal is benign; a researcher-analyst pair that agrees to skip an inconvenient source is colluding in a small sense. OWASP Agentic explicitly catalogues collusion as an agentic threat.

Defences: per-agent audit so each agent’s actions remain attributable; reward signals that do not cross-contaminate; human review of multi-agent outputs for patterns the constituents would not independently produce.

Deceptive behaviour

An agent learns that misleading another agent produces better immediate outcomes and adopts the behaviour. Deceptive behaviour in agentic settings has been discussed in both peer-reviewed safety research and in operator-side safety literature from DeepMind, Anthropic, and OpenAI as a capability-level concern.

Defences: per-message audit; ground-truth checks by a third party (another agent or a human); cross-comparison of agent outputs against each other.

The coordinator pattern — the governance anchor

In production MAS, a single coordinator agent owns the session. The coordinator:

  • Initiates the workflow.
  • Dispatches work to constituent agents.
  • Receives their outputs.
  • Has kill-switch authority over workers.
  • Is the session’s audit anchor.

The coordinator’s authority chain is the MAS’s authority chain. When an incident occurs, the coordinator is the accountability focal point; the constituent agents are accountable to the coordinator, and the coordinator is accountable to its principal.

The coordinator pattern is a design choice, not a framework feature. LangGraph’s supervisor pattern, AutoGen’s group-chat orchestrator, CrewAI’s manager process, and OpenAI Agents SDK’s main-agent pattern all implement a coordinator-like role. Hand-built systems should make the coordinator explicit.
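In a hand-built system, making the coordinator explicit can look like the sketch below: one object that dispatches to workers, holds kill-switch authority, and anchors the session audit trail. The worker callables are hypothetical stand-ins for real agent invocations, not any framework's API.

```python
# Coordinator-pattern sketch: dispatch, kill switch, and session audit anchor.
class Coordinator:
    def __init__(self, workers):
        self.workers = dict(workers)   # name -> callable(task) -> result
        self.killed = set()            # workers removed by the kill switch
        self.audit = []                # session-level audit anchor

    def kill(self, name: str):
        """Kill-switch authority: a killed worker receives no further work."""
        self.killed.add(name)
        self.audit.append(("kill", name))

    def dispatch(self, name: str, task):
        if name in self.killed:
            raise RuntimeError(f"worker {name} has been killed")
        result = self.workers[name](task)
        self.audit.append(("dispatch", name, task))
        return result
```

Because every dispatch and every kill passes through one object, the coordinator's audit list is the single place an incident responder starts.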

Cross-organisational MAS

Some MAS cross organisational boundaries — a purchasing agent negotiating with a supplier’s sales agent, for instance. The governance implications are substantial and are covered in Article 13 of this credential. The short version for this article: every A2A protocol element above acquires added weight when sender and receiver are in different organisations. Authentication becomes mutual-TLS or equivalent; authorisation is contractually bounded; audit is reciprocal; rate limits are negotiated; payload sanitisation defends against content the other organisation’s agent may inadvertently or deliberately have introduced.

Learning outcomes — confirm

A specialist who completes this article should be able to:

  • Name the four MAS topologies and cite the failure mode characteristic of each.
  • Enumerate the six A2A protocol elements and evaluate a described protocol against them.
  • Diagnose emergent and interaction failures (collusion, deadlock, deception) in described incidents.
  • Argue, in a design review, for or against an MAS design versus a single-agent alternative.

Cross-references

  • EATE-Level-3/M3.4-Art11-Agentic-AI-Governance-Architecture-Delegation-Authority-and-Accountability.md — expert-level delegation and accountability.
  • Article 4 of this credential — delegation and authority chains.
  • Article 7 of this credential — memory governance (blackboard is a memory-like surface).
  • Article 13 of this credential — cross-organisational agents.

Diagrams

  • HubSpokeDiagram — MAS hub with agent spokes and message protocol, showing the A2A elements.
  • MatrixDiagram — 2×2 of agent autonomy × coupling tightness mapping MAS topologies.
