COMPEL Specialization — AITE-ATS: Agentic AI Systems Architect Expert Article 20 of 40
Thesis. The first agent an organization builds costs a lot and takes a long time. The second agent is slightly easier. The third should be an order of magnitude easier than the first, and the tenth should feel routine. That progression does not happen by accident. It happens because an architect deliberately builds an agentic platform underneath — shared runtime, shared tool registry, shared memory services, shared safety layer, shared observability, shared evaluation infrastructure — so product teams compose new agents from platform services instead of re-implementing them. This article specifies the platform/product split, the capabilities the platform owns, and the runway roadmap that sequences what gets built when. It is the counterweight to Article 3’s build-your-own example: here we do it at scale and the raw-model reference is consumed by the platform, not re-authored per team.
Platform versus product — the split
A platform is the set of capabilities that any agent needs and that nobody benefits from re-implementing. A product is the specific agent a team builds for a specific use case. The split follows the “do it twice, extract a platform” rule, but with a stricter version for agentic systems: do not do it naively even once; design the platform first.
Platform responsibilities:
- Agent runtime (the substrate hosting the loop — Article 3).
- Tool registry (source of truth, MCP-backed — Article 5).
- Policy engine integration (Article 22).
- Authorization + validation services (Article 6).
- Memory services (vector, graph, episodic — Article 7).
- Safety primitives (guardrails, sanitizers, classifiers — Articles 8, 14).
- Kill-switch controller (Article 9).
- Observability + tracing stack (Article 15).
- Evaluation harness (Article 17).
- Cost management and FinOps (Article 19).
- Registries for agent, prompt, tool, memory (Article 26).
- Sandboxing service (Article 21).
- Incident-response runbooks and tooling (Article 25).
Product responsibilities:
- The use case itself (what task this agent performs).
- The task-specific prompts (within platform-provided templates).
- The task-specific tool selection from the registry.
- The task-specific evaluation dataset.
- The product UI / integration surface.
- The operational ownership (on-call for this agent).
The boundary is fuzzy in practice; the architect draws it and documents it. Over time capabilities migrate from product to platform as they prove useful to multiple teams.
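The split can be made concrete in code. A minimal sketch, assuming invented names (`PlatformServices`, `ProductAgent`, and the refund-triage example are illustrations for this article, not a real SDK):

```python
# Hypothetical sketch: a product agent composed from platform services.
# All class, field, and tool names here are illustrative assumptions.
from dataclasses import dataclass, field


@dataclass
class PlatformServices:
    """Capabilities the platform owns; products consume them, never rebuild them."""
    runtime: str = "shared-agent-runtime"
    tool_registry: str = "mcp-registry"
    memory: str = "memory-service"
    guardrails: str = "classifier-stack"
    observability: str = "otel-tracing"


@dataclass
class ProductAgent:
    """What a product team actually writes: use case, prompts, tool selection."""
    name: str
    task_prompt: str                           # task-specific, within platform templates
    tools: list = field(default_factory=list)  # selected from the platform registry
    platform: PlatformServices = field(default_factory=PlatformServices)


# A product team's entire "new agent" is configuration plus domain logic.
refund_agent = ProductAgent(
    name="refund-triage",
    task_prompt="Classify refund requests and draft a resolution.",
    tools=["crm.lookup_order", "payments.issue_refund"],
)
```

The design choice the sketch encodes is the one the lists above describe: the product type holds only differentiated content, while every shared capability arrives as a default from the platform.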
Ten capabilities classified
To draw the first platform-versus-product boundary, an architect classifies ten recurring capabilities.
- LLM API orchestration (retries, failover, rate-limit handling) — platform. Every agent needs this; no product should re-implement it.
- Agent loop implementation — platform. Framework-level but wrapped in a platform abstraction that handles policy hooks, observability, and kill-switch.
- Tool definition and registration — platform (registry is platform; individual tool implementations may live in product or platform depending on scope).
- Prompt template management — platform (registry is platform; product supplies the task-specific content).
- Evaluation dataset management — platform (harness infrastructure) + product (the datasets themselves).
- Guardrails and classifiers — platform (one classifier stack serves all agents).
- Memory stores — platform (shared infrastructure with per-agent namespaces).
- HITL queue and reviewer UI — platform (reviewer experience unified across agents).
- Audit log — platform (one queryable log; per-agent filtering).
- Domain-specific business logic (e.g., refund-amount calculation, credit scoring) — product. This is the agent’s differentiated value.
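The classification above can be expressed as checkable data. A minimal sketch (the capability keys and the `shared` label for split ownership are this example's assumptions):

```python
# Hypothetical encoding of the ten-capability classification.
# "shared" marks capabilities where the infrastructure is platform-owned
# but the content (e.g. datasets) is product-owned.
OWNERSHIP = {
    "llm_api_orchestration": "platform",
    "agent_loop": "platform",
    "tool_registration": "platform",
    "prompt_template_management": "platform",
    "evaluation_datasets": "shared",      # harness is platform, datasets are product
    "guardrails_classifiers": "platform",
    "memory_stores": "platform",
    "hitl_queue": "platform",
    "audit_log": "platform",
    "domain_business_logic": "product",   # the agent's differentiated value
}

platform_owned = [cap for cap, owner in OWNERSHIP.items() if owner == "platform"]
```

Encoding the boundary as data rather than tribal knowledge makes it reviewable: a new capability must be classified before it is built.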
The platform-runway roadmap
A platform is not built all at once. The runway roadmap sequences capabilities over quarters so the platform grows with demonstrated demand.
Phase 1 — MVP (first agent in production)
Build the minimum to get one agent live:
- Agent runtime.
- Basic tool registry (maybe just a code file).
- Basic authorization (per-tool manual checks).
- Basic observability (OTel emission to an existing APM).
- Basic evaluation (one golden-task battery).
- Manual kill-switch.
The first agent pays for the platform's first investments. Duration: 3–6 months for an organization starting from zero.
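A Phase-1 code-file registry can be very small. A hedged sketch, assuming illustrative names (`ToolEntry`, `register`, `lookup`, and the sample CRM tool are inventions for this example):

```python
# Minimal Phase-1 tool registry as a single code file: a dict plus two
# functions. All names and fields here are illustrative assumptions.
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class ToolEntry:
    name: str
    description: str
    handler: Callable
    requires_approval: bool = False  # hook for per-tool manual authorization checks


_REGISTRY: dict = {}


def register(entry: ToolEntry) -> None:
    """Add a tool; duplicates are a configuration error, not a silent override."""
    if entry.name in _REGISTRY:
        raise ValueError(f"duplicate tool: {entry.name}")
    _REGISTRY[entry.name] = entry


def lookup(name: str) -> ToolEntry:
    return _REGISTRY[name]


register(ToolEntry(
    name="crm.lookup_order",
    description="Fetch an order by id.",
    handler=lambda order_id: {"id": order_id, "status": "shipped"},
))
```

The point of starting this small is that the interface (`register`/`lookup` plus metadata) survives the Phase-2 migration to a formal MCP-backed registry even though the storage does not.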
Phase 2 — Shared platform (second and third agents)
The second agent triggers the formal platform investments: a tool registry with MCP, a centralized policy engine, a shared memory service for Layer 2, a unified guardrail stack, and an HITL queue with a basic UI. The third agent proves the platform amortizes: it is built in weeks, not months.
Duration: 6–12 months after Phase 1.
Phase 3 — Mature platform (tenth+ agent)
The platform now supports many products:
- Comprehensive registries for all four artifact types (agent, prompt, tool, memory).
- Sophisticated cost attribution.
- Self-service onboarding for new product teams.
- Platform-team-owned SLOs for platform services (uptime, auth latency, memory-store availability).
- Template agent configurations for common patterns (customer service, research, code).
- Internal consulting capacity for product teams.
Duration: ongoing; the platform is now a product itself with its own roadmap.
Governance of the platform
An agentic platform is a shared service; its governance is non-trivial.
Platform ownership. A named platform team owns the runtime, registries, policy engine, and shared services. Responsibilities include: uptime, backward compatibility, security, cost management, roadmap.
Product interface contract. Products interact with the platform through documented APIs and patterns. Breaking changes require deprecation notice and migration support. The platform team is accountable for stability.
Onboarding process. New product teams go through an onboarding that teaches the platform’s patterns, registry formats, safety requirements, and evaluation expectations. Without a process, every team re-invents.
Exception process. Products needing capabilities the platform doesn’t yet provide go through a documented exception process — either the platform team builds it, or the product builds a bespoke solution with a plan to contribute back or retire.
Cross-platform collaboration. In large organizations with multiple agentic platforms (rare but not unheard of), architects coordinate to avoid duplicate investment.
Build, buy, integrate — the platform layer
Each platform capability has build/buy/integrate options (Article 39 covers in depth).
- Runtime — LangGraph/OpenAI Agents SDK/build-your-own on raw model calls.
- Tool registry — MCP-backed OSS + custom metadata overlay; commercial options emerging.
- Policy engine — OPA, Cedar, or commercial (Styra, Permit.io); all appropriate.
- Memory — pgvector, Pinecone, Weaviate, Qdrant, Neo4j; no dominant choice.
- Guardrails/sanitizer — Rebuff, Lakera, LLM Guard, Azure AI Content Safety, custom.
- Observability — Langfuse, Arize, LangSmith, OTel stack, Datadog, W&B.
- Evaluation — Braintrust, Humanloop, OSS harness, Arize Phoenix.
- Sandboxing — E2B, Firecracker, gVisor, Docker.
A typical mature platform is a composition of OSS and commercial pieces with custom glue. Full-stack agentic platform products (Agentforce, Copilot Studio) exist; for organizations that want to ship fast rather than differentiate, these are legitimate choices.
Multi-tenancy at the platform layer
If the platform serves multiple tenants (whether external customers or internal business units), tenant isolation is a platform-level concern, not a product-level one.
- Tenant namespace per tool registry entry (tool is tenant-scoped or global).
- Tenant namespace per memory store (every read and write carries tenant context).
- Tenant attribution in authorization, audit, cost.
- Tenant-specific policy sets (multi-regime support).
- Per-tenant evaluation datasets.
- Per-tenant feature flags for rollout control.
Multi-tenant platforms can serve dozens to hundreds of agent configurations across tenants; without platform-level isolation they cannot.
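Tenant-scoped memory access can be sketched minimally. `TenantScopedMemory` and the tenant ids below are illustrative assumptions, not a platform API:

```python
# Hypothetical sketch of platform-level tenant isolation for a memory store.
# Every read and write carries tenant context; the composite key makes it
# structurally impossible to address another tenant's namespace.
from typing import Optional


class TenantScopedMemory:
    def __init__(self) -> None:
        # Keyed by (tenant_id, key): isolation is in the data model itself.
        self._store: dict = {}

    def write(self, tenant_id: str, key: str, value: str) -> None:
        self._store[(tenant_id, key)] = value

    def read(self, tenant_id: str, key: str) -> Optional[str]:
        return self._store.get((tenant_id, key))


mem = TenantScopedMemory()
mem.write("acme", "pref", "dark-mode")
```

The design choice is that tenancy lives in the method signatures, so a product team cannot forget to pass it; in a real deployment the tenant id would come from the authenticated request context, not a caller-supplied string.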
The build-your-own reference at platform scale
Article 3 demonstrated a build-your-own agent runtime on raw Llama 3 without a framework. At platform scale the same pattern is used deliberately for organizations whose scale, security posture, or air-gapped operations preclude framework dependencies. The platform wraps raw-model calls with its own loop, tool-call, authorization, and observability layers. Advantages: maximum control; no framework churn; can deploy to any environment. Disadvantages: higher engineering investment; team must track agentic research to keep current.
A representative organization that takes this path: a large bank operating in a sovereign-cloud or on-prem environment, using self-hosted Llama 3 or Mistral, building its agentic platform on Kubernetes with its own runtime. LangGraph or CrewAI sit alongside as optional tools, but the platform doesn’t require them; tools are MCP-format and portable.
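The wrapped loop can be sketched as follows. `call_model`, `check_policy`, and `emit_trace` are stand-ins for the platform's own raw-model, policy-engine, and observability layers, not real library calls:

```python
# Hedged sketch of the build-your-own pattern: the platform's own loop
# wraps a raw model call with policy and observability hooks.

def call_model(messages: list) -> dict:
    """Stand-in for a raw self-hosted LLM call (e.g. Llama 3 behind an API)."""
    return {"tool": None, "final": "done"}


def check_policy(action: dict) -> bool:
    """Stand-in for the centralized policy engine (Article 22)."""
    return True


def emit_trace(event: str, payload: dict) -> None:
    """Stand-in for OTel trace emission (Article 15)."""
    pass


def run_agent(task: str, max_steps: int = 8) -> str:
    messages = [{"role": "user", "content": task}]
    for step in range(max_steps):
        emit_trace("step", {"n": step})
        action = call_model(messages)
        if action["tool"] is None:
            return action["final"]          # model signals completion
        if not check_policy(action):
            raise PermissionError(action["tool"])  # policy denial is hard-fail
        # Execute the tool, append the observation, continue the loop.
        messages.append({"role": "tool", "content": "observation"})
    raise TimeoutError("max steps reached")  # bounded loop: a kill-switch primitive
```

Because the hooks are owned by the platform rather than a framework, swapping the model backend or the policy engine does not ripple into product code.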
Platform anti-patterns
Anti-pattern 1 — Platform-as-framework-wrapper. The “platform” is just LangGraph + some utilities. Nothing stops a team from forking off; no registries, no policy engine, no shared memory. Net benefit: small.
Anti-pattern 2 — Platform-without-team. Someone built the platform once; nobody owns it; it atrophies. Products drift back to their own implementations.
Anti-pattern 3 — Platform-as-gatekeeper-only. The platform exists only to say no; product teams route around it; shadow IT emerges. The platform team earns legitimacy by providing value, not by blocking.
Anti-pattern 4 — Premature over-engineering. Before any agent is in production, the team spends 12 months building a perfect platform nobody uses. MVP first; extract patterns from real usage.
Anti-pattern 5 — Platform scope creep. The platform tries to own product logic. Quality suffers; velocity slows; product teams lose autonomy and frustration builds.
Real-world anchor — Salesforce Agentforce
Salesforce’s Agentforce (2024) is the canonical example of a commercial agentic platform bundled with a major SaaS product. Agentforce provides agent runtime, tool registry (backed by Salesforce’s data), safety features, and observability as a platform, with customers configuring product-level agents on top. The pattern — platform-as-product — is one of the two dominant commercial models. Source: salesforce.com/agentforce.
Real-world anchor — Microsoft Copilot Studio
Microsoft Copilot Studio (2024) takes a similar platform approach with tighter integration to Microsoft 365 data and Azure AI services. Copilot Studio’s topics-and-actions abstraction is a different product-level vocabulary from Agentforce but serves a similar platform role. Architects evaluating buy-vs-build against these products measure the platform capabilities list above. Source: copilotstudio.microsoft.com.
Real-world anchor — Build-your-own OSS stack
A commonly published build-your-own reference: LangGraph or Semantic Kernel as runtime + OpenTelemetry for traces + OPA for policy + pgvector for memory + E2B for sandboxing + Rebuff for sanitization + MLflow for registries. The stack is entirely OSS, deployable in any environment, and has been documented in multiple engineering blogs in 2024–2025 as the reference architecture for teams that won’t adopt a commercial platform. The architect can build this stack from published references in an estimated 3–4 engineer-quarters.
Closing
Platform versus product, ten capabilities classified, three phases of runway, five anti-patterns. The architect’s job is to build the platform that makes each subsequent agent cheaper than the last. Article 21 begins the platform deep-dive with sandboxing — the execution-isolation layer that every code-executing agent needs and that platforms own.
Learning outcomes check
- Explain the agentic platform/product split and the responsibilities that typically fall on each side.
- Classify ten capabilities by platform vs product ownership, with justification rooted in reuse and differentiation.
- Evaluate a platform design for coupling (is product logic leaking into platform?), ownership, exception process, and multi-tenant isolation.
- Design a platform-runway roadmap for a given organization including MVP scope, shared-platform investments, and maturity milestones.
Cross-reference map
- Core Stream: EATE-Level-3/M3.3-Art11-Enterprise-Agentic-AI-Platform-Strategy-and-Multi-Agent-Orchestration.md; EATE-Level-3/M3.4-Art14-EU-AI-Act-Article-6-High-Risk-Classification-Deep-Dive.md.
- Sibling credential: AITF-PLP Article 8 (platform-engineering angle); AITM-AAG Article 14 (platform governance).
- Forward reference: Articles 21 (sandboxing), 22 (policy engines), 26 (registries), 27 (security architecture), 39 (build vs buy), 40 (capstone platform).