Agent, Prompt, Tool, and Memory Registries

FlowRidge

The architect owns the registry design, the platform team owns the implementation, and the product teams consume the registries. This article gives the schema each registry needs, the lifecycle workflow that binds them, and the integration points that keep artifacts from becoming orphans.

Why four registries, not one

A single “artifact registry” is tempting but wrong. The four classes have different review cadences, different approval authorities, different field sets, and different integration surfaces. Agents change rarely and need architecture review; prompts change frequently and need prompt-engineering review; tools change when systems change and need security review; memories are written continuously and need data-governance review. Collapsing them into one forces the lowest-common-denominator workflow on every artifact and yields the bureaucratic registry every team avoids.

The four registries are distinct but interlinked. An agent record references prompts (by ID + version), tools (by ID + version), and memory namespaces. Lineage traces across the four.

Registry 1 — Agent registry

The agent registry is the authoritative inventory of agents deployed in the organization. Minimum fields:

agent_id — stable identifier that survives versioning.
name and description — human-readable.
autonomy_level — L0–L5 per the autonomy spectrum (Article 2).
version — semver (major.minor.patch).
status — draft / in-review / approved / promoted / deprecated / retired.
loop_pattern — ReAct, Plan-and-Execute, Reflexion, state-graph (Article 4).
runtime — LangGraph, CrewAI, AutoGen, OpenAI Agents SDK, Semantic Kernel, LlamaIndex Agents, custom (Article 3).
model — provider + model name + version (e.g., Anthropic Claude 3.5 Sonnet 2024-10-22).
prompt_refs — list of {prompt_id, version} used by the agent.
tool_refs — list of {tool_id, version} authorized for the agent.
memory_refs — list of memory namespaces the agent reads/writes.
owner_team — product team accountable for the agent.
architect — AITE-ATS holder responsible for the reference architecture.
compliance_classification — EU AI Act tier (prohibited / high-risk / limited / minimal), plus any applicable sector regulation (SR 11-7, HIPAA, GxP, etc.).
approval_state — links to Calibrate, Organize, Model, Produce gate decisions (Articles 36–37).
kill_switch_config — reference to kill-switch spec (Article 9).
evaluation_plan_ref — reference to the golden-tasks + adversarial battery (Article 17).
lineage — parent agent (if forked), ancestor versions, promotion history.
last_updated, deprecated_at, and retired_at.

Approval authority for agent records: architect + security + compliance + platform team. Agents with EU AI Act high-risk classification require additional documented approval.
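A minimal sketch of how a subset of these fields could be typed and validated, assuming Python dataclasses as the record format. The class names, the validate method, and the high_risk_approval_ref field are hypothetical illustrations, not part of any particular registry product.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

AGENT_STATUSES = ("draft", "in-review", "approved", "promoted", "deprecated", "retired")
EU_AI_ACT_TIERS = ("prohibited", "high-risk", "limited", "minimal")
SEMVER = re.compile(r"^\d+\.\d+\.\d+$")  # major.minor.patch only; pre-release tags not handled


@dataclass(frozen=True)
class VersionedRef:
    """Pointer to one exact version of a prompt or tool record."""
    id: str
    version: str


@dataclass
class AgentRecord:
    agent_id: str
    name: str
    version: str
    autonomy_level: int  # 0..5, standing in for L0-L5
    status: str = "draft"
    compliance_classification: str = "minimal"
    prompt_refs: List[VersionedRef] = field(default_factory=list)
    tool_refs: List[VersionedRef] = field(default_factory=list)
    memory_refs: List[str] = field(default_factory=list)
    owner_team: str = ""
    high_risk_approval_ref: Optional[str] = None  # hypothetical field name

    def validate(self) -> List[str]:
        """Return human-readable problems; an empty list means well-formed."""
        problems = []
        if not SEMVER.match(self.version):
            problems.append(f"version {self.version!r} is not major.minor.patch")
        if not 0 <= self.autonomy_level <= 5:
            problems.append("autonomy_level must be between 0 and 5")
        if self.status not in AGENT_STATUSES:
            problems.append(f"unknown status {self.status!r}")
        if self.compliance_classification not in EU_AI_ACT_TIERS:
            problems.append(f"unknown tier {self.compliance_classification!r}")
        if self.compliance_classification == "high-risk" and not self.high_risk_approval_ref:
            problems.append("high-risk agents need a documented additional approval")
        return problems
```

Returning a list of problems rather than raising on the first one lets a review UI show every gap in a draft record at once.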

Registry 2 — Prompt registry

The prompt registry holds the system prompts, task prompts, and prompt templates the agent uses. Prompts change more often than agents, so the prompt registry is the highest-velocity of the four. Minimum fields:

prompt_id and version.
text — the actual prompt content.
purpose — system, task, tool-description, guard, eval.
variables — named slots injected at runtime.
model_compatibility — which models this prompt is tuned for.
authored_by, reviewed_by.
eval_scores — goal-achievement, refusal-rate, injection-resistance on the standard eval battery.
status — draft / approved / active / deprecated.
lineage — parent prompt (if branched), ancestor versions, A/B lineage.
change_rationale — why this version replaces the prior.

Approval authority for prompts: prompt-engineering lead + product-team lead, with security review for prompts that shape authorization or refusal behavior. High-risk deployments require compliance review of prompts that generate customer-facing disclosures (the transparency obligations in EU AI Act Article 50).
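One check a prompt registry can run at review time is that the declared variables match the slots actually present in the text. The sketch below assumes prompts use Python str.format-style {name} slots with simple names; the function name is hypothetical, and a registry using another template syntax would need a different parser.

```python
from string import Formatter
from typing import List, Set, Tuple


def variable_mismatches(text: str, variables: List[str]) -> Tuple[Set[str], Set[str]]:
    """Compare slots used in the prompt text with the declared variables.

    Returns (undeclared, unused): slots that appear in the text but are not
    declared, and declared variables that never appear in the text.
    """
    # Formatter().parse yields (literal, field_name, format_spec, conversion);
    # field_name is None for trailing literal text.
    used = {name for _, name, _, _ in Formatter().parse(text) if name}
    declared = set(variables)
    return used - declared, declared - used


undeclared, unused = variable_mismatches(
    "You are a {role}. Answer questions about {topic}.",
    ["role", "tone"],
)
```

Here undeclared contains "topic" and unused contains "tone", so the record would be sent back to its author before review. Literal braces in a prompt (for example an embedded JSON sample) would have to be doubled under this assumption.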

Registry 3 — Tool registry

The tool registry is the source of truth for tools the agent can call. The schema-discipline article (Article 5) covers the schema side; the registry adds the operational fields. Minimum fields:

tool_id and version.
name, description, schema — JSON-schema for parameters.
executor — which service implements the tool.
permissions — scopes required to execute (maps to Article 6 authorization).
side_effect_class — read-only / write-reversible / write-irreversible.
sensitivity — public / internal / confidential / regulated.
data_classes_touched — which PII, PHI, or regulated data categories.
rate_limit — calls per minute per tenant and per agent.
owner — the system owner (not the agent team).
status — draft / active / deprecated / retired.
MCP_manifest_ref — pointer to the Model Context Protocol manifest, if the tool is MCP-exposed (memo §1.3).
deprecation_policy — when deprecated, the window in which existing agents must migrate.
incident_history — link to incidents involving this tool.

Approval authority for tools: system owner + security + data-governance + architect. Irreversible-write tools get a second reviewer and cannot be registered without a compensating-transaction plan.
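The rule for irreversible-write tools can be enforced at registration time rather than left to reviewers' memory. A minimal sketch, assuming a dataclass record; reviewers and compensating_plan_ref are hypothetical field names introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional

SIDE_EFFECT_CLASSES = ("read-only", "write-reversible", "write-irreversible")


@dataclass
class ToolRecord:
    tool_id: str
    version: str
    side_effect_class: str
    owner: str
    reviewers: List[str] = field(default_factory=list)   # hypothetical field
    compensating_plan_ref: Optional[str] = None          # hypothetical field


def registration_blockers(tool: ToolRecord) -> List[str]:
    """Return the reasons this tool record cannot be registered yet."""
    blockers = []
    if tool.side_effect_class not in SIDE_EFFECT_CLASSES:
        blockers.append(f"unknown side_effect_class {tool.side_effect_class!r}")
    if tool.side_effect_class == "write-irreversible":
        # Count distinct reviewers so one person cannot approve twice.
        if len(set(tool.reviewers)) < 2:
            blockers.append("irreversible-write tools need a second reviewer")
        if not tool.compensating_plan_ref:
            blockers.append("irreversible-write tools need a compensating-transaction plan")
    return blockers
```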

Registry 4 — Memory registry

The memory registry is the one most often missing from agent platforms. It is the authoritative inventory of memory namespaces the agent platform manages. Minimum fields:

namespace_id.
type — short-term (session), long-term (vector store), episodic (trace log), semantic (knowledge graph) — Article 7.
store — concrete backing service (pgvector, Pinecone, Weaviate, Chroma, Redis, Neo4j).
retention_policy — days/months; deletion criteria.
tenant_isolation_mode — row-level security, per-tenant schema, per-tenant cluster.
data_classification — PII, PHI, sensitive commercial, etc.
write_policy — who can write; provenance required per write.
read_policy — who can read; RBAC/ABAC rules.
encryption — at-rest, in-transit.
export_policy — lawful-export procedure (data-subject access and portability requests under GDPR Articles 15 and 20).
forget_policy — procedure for targeted forgetting (GDPR Article 17 + poisoning remediation — Article 25).
owner_team — who runs the store.
backup_cadence and snapshot_retention — for rollback during incidents.

Approval authority for memory namespaces: data-governance + security + compliance + architect. PII-containing namespaces require a DPIA reference.
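As a sketch of how a few of these fields become checkable, the example below models a namespace record, blocks approval of PII namespaces that lack a DPIA reference, and tests a write date against the retention policy. It assumes retention is expressed in whole days; dpia_ref, forget_policy_ref, and both function names are hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional, Set


@dataclass
class MemoryNamespace:
    namespace_id: str
    type: str                      # e.g. "short-term", "long-term", "episodic", "semantic"
    store: str                     # e.g. "pgvector"
    retention_days: int            # assumption: retention expressed in whole days
    tenant_isolation_mode: str
    data_classification: Set[str]  # e.g. {"PII"}
    forget_policy_ref: Optional[str] = None  # hypothetical field
    dpia_ref: Optional[str] = None           # hypothetical field


def approval_blockers(ns: MemoryNamespace) -> List[str]:
    """Return the reasons this namespace cannot be approved yet."""
    blockers = []
    if "PII" in ns.data_classification and not ns.dpia_ref:
        blockers.append("PII namespace needs a DPIA reference")
    if ns.retention_days <= 0:
        blockers.append("retention_days must be positive")
    if not ns.forget_policy_ref:
        blockers.append("forget_policy is missing")
    return blockers


def is_past_retention(written_on: date, ns: MemoryNamespace, today: date) -> bool:
    """True once an item written on `written_on` has outlived the retention window."""
    return today > written_on + timedelta(days=ns.retention_days)
```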

Lineage — the connective tissue

Lineage is what turns four registries into a platform. Each of the following lookups should be answerable:

Agent → all prompts. Given an agent version, list every prompt (ID + version) it uses.
Prompt → all agents. Given a prompt version, list every agent consuming it. Required to assess blast radius when retiring a prompt.
Tool → all agents. Given a tool version, list every agent with authorization to call it. Required for security-patch rollout planning.
Memory → all agents. Given a memory namespace, list every agent reading and writing. Required for tenant-isolation audits.
Agent history. Given an agent, list its version history with promotion dates, change rationales, evaluation scores.
Incident correlation. Given an incident, list the agent/prompt/tool/memory combination active at the time.

The lineage graph must be queryable; the architect’s test is “can you trace an EU AI Act Article 14 human-oversight claim back to specific registry records in under five minutes?” If not, the registry is not meeting its conformity-assessment purpose.
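The reverse lookups (prompt, tool, or memory to agents) all reduce to inverting the references stored on agent records. A minimal in-memory sketch, assuming agent records are plain dictionaries and references are (id, version) tuples; the sample agents and the function name are made up for illustration, and a real registry would more likely answer these queries from a database or graph store.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Set, Tuple

AgentKey = Tuple[str, str]  # (agent_id, version)

agents = [
    {"agent_id": "support-triage", "version": "2.1.0",
     "prompt_refs": [("sys-triage", "3.0.0")],
     "tool_refs": [("crm.lookup", "1.2.0")],
     "memory_refs": ["tenant-kb"]},
    {"agent_id": "billing-helper", "version": "1.0.0",
     "prompt_refs": [("sys-triage", "3.0.0"), ("billing-task", "1.1.0")],
     "tool_refs": [("crm.lookup", "1.2.0"), ("refund.issue", "2.0.0")],
     "memory_refs": ["tenant-kb", "billing-session"]},
]


def build_reverse_index(records: List[dict], ref_field: str) -> Dict[Hashable, Set[AgentKey]]:
    """Map each referenced artifact to the set of agent versions that reference it."""
    index: Dict[Hashable, Set[AgentKey]] = defaultdict(set)
    for record in records:
        for ref in record[ref_field]:
            index[ref].add((record["agent_id"], record["version"]))
    return index


prompt_index = build_reverse_index(agents, "prompt_refs")
tool_index = build_reverse_index(agents, "tool_refs")
memory_index = build_reverse_index(agents, "memory_refs")

# Blast radius of retiring one prompt version:
affected = prompt_index[("sys-triage", "3.0.0")]
```

With the sample data, affected holds both agent versions, while tool_index[("refund.issue", "2.0.0")] holds only billing-helper 1.0.0.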

Workflow — approval and promotion

Each registry has a lifecycle workflow with named states and approval authorities. Proposal flow (common pattern):

Draft. Author creates a draft record.
In-review. Approvers notified. Evaluation results attached. Change rationale documented.
Approved. Record can be referenced by agent records but not yet active in production.
Active. Record is in use by at least one promoted agent.
Deprecated. Record is usable by existing agents but not by new ones; migration window opens.
Retired. Record is unreachable; historical audit only.

Cross-registry dependency rule: an agent cannot be promoted while it references prompts, tools, or memory namespaces that are still in Draft or In-review. This prevents the agent registry from pulling unreviewed content into production.
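The lifecycle and the dependency rule can be sketched as a transition table plus one promotion check. The table below assumes a strictly linear flow with one extra edge (a rejected review returning to draft), which is an assumption rather than something the workflow above specifies; the function names and reference strings are hypothetical.

```python
from typing import Dict, List

# Allowed next states for each lifecycle state.
ALLOWED_TRANSITIONS = {
    "draft": {"in-review"},
    "in-review": {"approved", "draft"},  # assumption: rejection sends the record back to draft
    "approved": {"active"},
    "active": {"deprecated"},
    "deprecated": {"retired"},
    "retired": set(),
}

UNREVIEWED = {"draft", "in-review"}


def can_transition(current: str, target: str) -> bool:
    """True if the lifecycle permits moving from `current` to `target`."""
    return target in ALLOWED_TRANSITIONS.get(current, set())


def promotion_blockers(dependency_status: Dict[str, str]) -> List[str]:
    """Return the referenced records that are still unreviewed.

    `dependency_status` maps a reference label (prompt, tool, or memory
    namespace) to that record's current lifecycle state.
    """
    return sorted(ref for ref, status in dependency_status.items() if status in UNREVIEWED)


blockers = promotion_blockers({
    "prompt:sys-triage@3.0.0": "approved",
    "tool:refund.issue@2.0.0": "in-review",
    "memory:tenant-kb": "active",
})
```

An empty blockers list would let the promotion proceed; here the in-review tool holds it back.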

Integration points that prevent orphans

Three integration points keep the registries honest:

Runtime enforcement. The agent runtime (Article 3) refuses to execute if the agent references a deprecated tool without a compatible active version, or if a referenced memory namespace is retired. This is a runtime check, not just a documentation convention — that is what turns registry entries into platform guarantees.
CI/CD integration. Pull requests that change prompts must land in the prompt registry as new versions, not as edits to the approved record. The CI pipeline refuses merges that violate this rule.
Evaluation binding. Promotion of any agent/prompt requires the evaluation harness (Article 17) to have run against the specific version; the evaluation result is attached to the registry record. No orphan evaluation; no unevaluated promotion.
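The first two integration points can be sketched as two small checks. The pre-flight check assumes that a "compatible active version" means an active version of the same tool with the same semver major number, which is an interpretation rather than a rule stated above. The CI check assumes the registry stores a SHA-256 hash of each approved prompt's text. All names here are hypothetical.

```python
import hashlib
from typing import Dict, List, Tuple

Ref = Tuple[str, str]  # (id, semver version)


class RegistryViolation(Exception):
    """Raised when the runtime must refuse to start an agent."""


def major(version: str) -> int:
    return int(version.split(".")[0])


def preflight(tool_refs: List[Ref], memory_refs: List[str],
              tool_status: Dict[Ref, str], memory_status: Dict[str, str]) -> None:
    """Runtime check run before the agent executes; raises on a violation."""
    for tool_id, version in tool_refs:
        if tool_status.get((tool_id, version)) != "deprecated":
            continue
        # Assumption: compatible means same tool, same major version, status active.
        has_successor = any(
            tid == tool_id and status == "active" and major(ver) == major(version)
            for (tid, ver), status in tool_status.items()
        )
        if not has_successor:
            raise RegistryViolation(f"{tool_id}@{version} is deprecated with no compatible active version")
    for namespace in memory_refs:
        if memory_status.get(namespace) == "retired":
            raise RegistryViolation(f"memory namespace {namespace} is retired")


def edits_approved_record(prompt_id: str, version: str, new_text: str,
                          approved_hashes: Dict[Ref, str]) -> bool:
    """CI check: True if a change rewrites the text of an already-approved version."""
    pinned = approved_hashes.get((prompt_id, version))
    if pinned is None:
        return False  # a new version, which is the permitted path
    return hashlib.sha256(new_text.encode("utf-8")).hexdigest() != pinned
```

A CI job would fail the merge whenever edits_approved_record returns True, and the runtime would call preflight with statuses read from the registry at start-up.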

Real-world registry patterns

MLflow Model Registry (open-source). MLflow ships an artifact registry with versioning and stage transitions (staging, production, archived). For agentic systems, MLflow’s model registry is too narrow — it does not model prompts, tools, or memory — but its lifecycle vocabulary (version, stage, transition approval) is the template many agent registries adopt. Teams building on LangGraph or Semantic Kernel frequently layer MLflow-style registries on top.

Hugging Face Hub (open-source). Hugging Face provides model + dataset registries with versioned cards (model cards, dataset cards). Its documentation-card vocabulary translates well to agent cards (autonomy, loop, tools, memory, evaluation). Teams using OpenAI Agents SDK or CrewAI often export agent records to Hugging Face-style cards for documentation portability.

Weights & Biases Models (commercial; competitors include Comet and Neptune). W&B Models adds evaluation lineage: any model promotion carries the evaluation-run IDs that validated it. For agentic systems, the equivalent discipline, in which any agent promotion carries evaluation-run IDs, is exactly the binding Article 17 asks for.

Anthropic Model Context Protocol server registry. MCP servers publish tool manifests; the platform’s tool registry can hydrate itself from MCP manifests, and MCP version-pinning protects against silent tool-schema drift.
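A rough sketch of hydration and drift detection, assuming the manifest is the result of an MCP tools/list call, in which each tool carries name, description, and inputSchema. The registry record layout, the schema_sha256 field, and the function names are hypothetical, and hashing the schema is one possible pinning mechanism rather than something MCP itself provides.

```python
import hashlib
import json
from typing import Dict, List


def schema_fingerprint(schema: dict) -> str:
    """Hash a JSON schema in canonical form so key order does not affect the result."""
    canonical = json.dumps(schema, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def hydrate_from_mcp(tools_list_result: dict, manifest_ref: str) -> Dict[str, dict]:
    """Turn a tools/list result into draft tool-registry records keyed by tool name."""
    records = {}
    for tool in tools_list_result.get("tools", []):
        records[tool["name"]] = {
            "name": tool["name"],
            "description": tool.get("description", ""),
            "schema": tool["inputSchema"],
            "schema_sha256": schema_fingerprint(tool["inputSchema"]),  # hypothetical field
            "MCP_manifest_ref": manifest_ref,
            "status": "draft",  # hydrated records still go through review
        }
    return records


def drifted_tools(pinned: Dict[str, dict], fresh: Dict[str, dict]) -> List[str]:
    """Names of pinned tools whose schema changed or disappeared in a fresh manifest."""
    return sorted(
        name for name, record in pinned.items()
        if name not in fresh or fresh[name]["schema_sha256"] != record["schema_sha256"]
    )
```

Comparing a pinned snapshot with a freshly hydrated one on a schedule would surface schema changes as registry events instead of as silent runtime failures.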

Anti-patterns to reject

“The prompts live in the app repo.” Untracked by the registry, unversioned, invisible to lineage queries. Rejected.
“The tool list is in the YAML config.” Tools without approval workflow, without schema review, without incident history. Rejected.
“We use the vector store, we don’t need a memory registry.” Undocumented tenant isolation, ad-hoc retention, no forget procedure. Rejected.
“One registry, many types.” Lowest-common-denominator workflow on every artifact; fails in practice.
“Registry is documentation-only.” If the runtime does not enforce, the registry is a wish.

Learning outcomes

Explain the four registries — agent, prompt, tool, memory — and the minimum fields each requires.
Classify eight agentic artifacts (a system prompt, a REST tool, a vector-store namespace, a planner agent, a sub-agent, an MCP-exposed tool, a session-memory scratchpad, a knowledge graph) by registry.
Evaluate a registry design for lineage gaps, runtime-enforcement gaps, and approval-workflow gaps.
Design a registry specification for a given agentic platform including schemas, workflows, integration points, and lineage queries.