AITM-PEW: Prompt Engineering Associate — Body of Knowledge Artefact Template 1 of 1
How to use this template
Copy this template for each production prompt your team operates. Fill in each section. Where a section does not apply to your feature, state so explicitly and give the reason; do not delete the section. The template is designed to match the prompt-registry schema in Article 9, the harness design in Article 8, and the regulatory checklist in Article 10. A completed template is the minimum documentation a regulator, an auditor, or an incident responder will expect to see.
Sections marked [required] block publication if not completed. Sections marked [recommended] produce a review-time flag if omitted.
1. Identity [required]
- Prompt identifier: <stable-slug-that-outlives-name-changes>
- Prompt title: <short-human-readable-title>
- Version: <major.minor.patch> (semantic; see Article 9 rules)
- Created: <yyyy-mm-dd>
- Last updated: <yyyy-mm-dd>
- Status: draft | canary | active | deprecated | archived
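The version and status fields above lend themselves to mechanical checks. A minimal sketch, in Python, of what a registry might enforce: a pinned semantic version and forward-only, single-step moves through the status lifecycle (the same lifecycle the completed-template checklist at the end of this template walks through). The function and variable names here are illustrative, not part of the registry schema.

```python
import re

# The lifecycle stages, in order, from the Status field above.
STATUS_FLOW = ["draft", "canary", "active", "deprecated", "archived"]

# Pinned major.minor.patch only; "latest" or a bare "1.2" would not match.
SEMVER = re.compile(r"^\d+\.\d+\.\d+$")

def valid_transition(current: str, proposed: str) -> bool:
    """Allow only forward, single-step moves through the lifecycle."""
    return (current in STATUS_FLOW and proposed in STATUS_FLOW
            and STATUS_FLOW.index(proposed) == STATUS_FLOW.index(current) + 1)
```

A registry that rejects `valid_transition("canary", "archived")` forces teams through the checklists below rather than around them.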
2. Ownership [required]
- Primary owner: <name, role, contact>
- Backup owner: <name, role, contact>
- Technical reviewer: <name, role>
- Governance reviewer: <name, role>
- Security reviewer: <name, role> (required if the feature touches external users, regulated data, or tool-layer actions)
- Product / legal reviewer: <name, role> (required for customer-facing or financially material features)
3. Intended use and boundary [required]
- Feature name: <the product feature this prompt drives>
- Intended use statement: one paragraph describing what the prompt does, for whom, and under what conditions.
- In-scope questions / tasks: enumerated list.
- Out-of-scope questions / tasks: enumerated list; the prompt should refuse these explicitly.
- User population: who can invoke this feature (authenticated employees; customers; public; API-only consumers).
- Risk tier: low | moderate | high | very-high, per the organisational risk framework; link to the risk assessment.
4. Prompt composition [required]
- System instruction: either inline or linked to the versioned file in the source repository.
- Few-shot examples: count and, if applicable, a summary of the demonstration pattern; linked to the versioned file.
- Retrieval binding: name of the retrieval source; version; indexed content summary; update cadence.
- Tool schemas: list of tool names the model may invoke; for each, link to the schema file and the permission envelope (Article 5).
- User input contract: expected input shape, length bounds, language coverage.
- Output contract: expected output shape (free text, structured JSON, function-call payload); if structured, link to the schema; confidence-indicator convention; citation convention for RAG features.
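Where the output contract is structured JSON, the contract can be checked in code at the orchestration layer rather than only documented. A minimal sketch, assuming a hypothetical RAG-feature contract with an answer, a confidence band, and citations; the field names are illustrative, not prescribed by this template.

```python
import json

# Hypothetical output contract: field name -> expected Python type.
OUTPUT_CONTRACT = {
    "answer": str,
    "confidence": str,   # e.g. "high" | "medium" | "low"
    "citations": list,   # identifiers from the retrieval source
}

def validate_output(raw: str) -> list[str]:
    """Return a list of contract violations; an empty list means the output conforms."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    errors = []
    for field, expected_type in OUTPUT_CONTRACT.items():
        if field not in payload:
            errors.append(f"missing field: {field}")
        elif not isinstance(payload[field], expected_type):
            errors.append(f"wrong type for {field}")
    # RAG convention from the contract above: every answer carries a citation.
    if isinstance(payload.get("citations"), list) and not payload["citations"]:
        errors.append("citations list is empty")
    return errors
```

The same violation list can feed the audit log declared in Section 6 and the grounding dimension of the harness in Section 7.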
5. Model bindings [required]
- Primary model: provider, model name, specific version approved. Examples: a managed closed-weight model (OpenAI, Anthropic, Gemini, Mistral managed), a cloud-hosted model (via AWS Bedrock, Azure AI Foundry, GCP Vertex), or a self-hosted open-weight model (Llama, Mistral open weights, Qwen, DeepSeek). Name the specific version; "latest" is not acceptable.
- Fallback model(s): provider and version for cases where the primary is unavailable.
- Parameter settings: temperature, top-p, max tokens, stop sequences, response format (JSON mode, strict schema, grammar).
- Cost budget: declared maximum tokens per request and dollars per thousand requests.
- Latency budget: declared p50 and p95 targets.
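The binding rules above, pinned versions and declared budgets, are also checkable mechanically. A hedged sketch of a binding record and a lint over it; the provider names, version strings, and budget figures are placeholders, not recommendations.

```python
# Illustrative model-binding record matching the fields in this section.
BINDING = {
    "primary_model": {"provider": "example-provider", "name": "example-model",
                      "version": "2025-01-15"},
    "fallback_models": [{"provider": "example-provider", "name": "example-model-small",
                         "version": "2024-11-01"}],
    "parameters": {"temperature": 0.2, "top_p": 0.9, "max_tokens": 1024,
                   "response_format": "json"},
    "cost_budget": {"max_tokens_per_request": 2048, "usd_per_1k_requests": 4.00},
    "latency_budget_ms": {"p50": 800, "p95": 2500},
}

def binding_violations(binding: dict) -> list[str]:
    """Flag unpinned versions and missing budget declarations."""
    errors = []
    models = [binding["primary_model"], *binding.get("fallback_models", [])]
    for m in models:
        if m.get("version") in ("latest", "", None):
            errors.append(f"{m.get('name')}: version must be pinned, not 'latest'")
    for budget in ("cost_budget", "latency_budget_ms"):
        if budget not in binding:
            errors.append(f"missing {budget}")
    return errors
```

Running this lint as part of the registry's publication gate is one way to enforce the "no latest" rule in the checklist at the end of this template.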
6. Guardrail bindings [required]
Declare the platform-level controls wrapping this feature. Name the specific product, service, or library at each layer; alternatives exist at every layer and the team’s choice is documented here.
- Input classifier: name (e.g., Llama Guard, NeMo Guardrails, Guardrails AI, Azure AI Content Safety, Amazon Bedrock Guardrails, OpenAI Moderation, Gemini safety filters, or a team-built alternative); configuration summary.
- Output classifier: name and configuration.
- Retrieval sanitisation: process for detecting instruction-shaped content in indexed documents; owner.
- Tool-call validator: orchestration-layer component enforcing the permission envelope; owner.
- Rate limiting and abuse prevention: per-user and per-tenant limits.
- Audit logging: what is logged, where, for how long.
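The layering declared above can be sketched as a single wrapper at the orchestration layer: input classifier, model call, output classifier, audit entry. The classifier functions below are stand-ins for whichever product or team-built component this section names, and the refusal strings are illustrative.

```python
import time

def input_ok(text: str) -> bool:
    """Stand-in for the named input classifier (naive substring check only)."""
    return "ignore previous instructions" not in text.lower()

def output_ok(text: str) -> bool:
    """Stand-in for the named output classifier."""
    return len(text) > 0

def guarded_call(user_input: str, call_model, audit_log: list) -> str:
    """Run one request through the declared guardrail layers, logging each outcome."""
    entry = {"ts": time.time(), "input_flagged": False, "output_flagged": False}
    if not input_ok(user_input):
        entry["input_flagged"] = True
        audit_log.append(entry)
        return "[refused: input failed safety screening]"
    response = call_model(user_input)
    if not output_ok(response):
        entry["output_flagged"] = True
        audit_log.append(entry)
        return "[withheld: output failed safety screening]"
    audit_log.append(entry)
    return response
```

Note that every branch writes an audit entry; the "what is logged, where, for how long" declaration above is only auditable if refusals are logged as faithfully as successes.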
7. Evaluation plan [required]
Declare the test-case populations, scoring methods, thresholds, and cadences for each of the six harness dimensions (Article 8).
| Dimension | Test population | Scoring | Threshold | Cadence | Failure consequence |
|---|---|---|---|---|---|
| Correctness | | | | | |
| Grounding | | | | | |
| Safety (adversarial resistance) | | | | | |
| Style | | | | | |
| Stability | | | | | |
| Cost | | | | | |
- Offline harness location: link to the harness repository and the current test-case set.
- Online evaluation sampling rate: declared percentage of production traffic scored online.
- Harness product: if using a commercial or open-source product, name it (Arize, Langfuse, Weights & Biases, MLflow, Humanloop, WhyLabs, LangSmith, Promptfoo, or a team-built alternative).
- Cadence schedule: pre-commit triggers, pre-deployment gate, weekly full run, monthly adversarial sweep, on-vendor-event trigger.
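The "failure consequence" column above is typically a hard gate. A minimal sketch of a pre-deployment gate over the six dimensions; the threshold values are illustrative and belong in the table, not in code.

```python
# Illustrative per-dimension pass thresholds (proportion of test cases passing).
THRESHOLDS = {
    "correctness": 0.90, "grounding": 0.95, "safety": 0.99,
    "style": 0.85, "stability": 0.90, "cost": 0.95,
}

def gate(run_scores: dict) -> tuple[bool, list[str]]:
    """Return (passed, failing_dimensions); a missing dimension counts as a failure."""
    failures = [dim for dim, threshold in THRESHOLDS.items()
                if run_scores.get(dim, 0.0) < threshold]
    return (not failures, failures)
```

Wiring this gate into the pre-deployment trigger in the cadence schedule makes the change-control section's "harness result: pass/fail per dimension" a computed fact rather than a manual claim.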
8. Change control [required]
- Current change rationale: why this version was produced; what problem it solves or which test it addresses.
- Diff from previous version: link to the code-review record.
- Harness result: pass/fail per dimension; link to the run.
- Approval chain: reviewer names and sign-off timestamps.
- Deployment plan: canary fraction, canary duration, rollout trigger, rollback trigger.
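The rollout and rollback triggers in the deployment plan can likewise be stated as code. A hedged sketch, assuming a simple error-rate regression check against the previous version's baseline; the margin and canary duration are placeholders for whatever this plan declares.

```python
def canary_decision(error_rate: float, baseline: float, elapsed_h: float,
                    duration_h: float = 24.0, regression_margin: float = 0.02) -> str:
    """Return 'rollback', 'promote', or 'continue' for the current canary window.

    error_rate / baseline: failure proportions for the canary and prior version.
    elapsed_h / duration_h: hours the canary has run vs. the declared window.
    """
    if error_rate > baseline + regression_margin:
        return "rollback"            # rollback trigger: measurable regression
    if elapsed_h >= duration_h:
        return "promote"             # rollout trigger: clean full window
    return "continue"                # keep watching
```

Writing the triggers this way also satisfies the canary-to-active checklist item below that the rollback procedure be testable.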
9. Regulatory and compliance posture [required]
- EU AI Act Article 50 applicability: is this feature interacting directly with natural persons? Is the AI nature obvious? Is the disclosure in place?
- EU AI Act high-risk classification: does this feature fall under Annex III high-risk categories? Link to the classification analysis.
- NIST AI RMF GOVERN alignment: which GOVERN subcategories apply; link to the policy documents.
- ISO 42001 Clause 7.5 documented information: confirm that the registry entry, harness results, approval records, and incident log are linked and accessible.
- Sector-specific regulation: GDPR, HIPAA, FINRA, PCI-DSS, or other applicable frameworks.
- California AB 2013 / SB 942 applicability: does the feature reach California users? If yes, has the training-data summary and AI-detection tool duty been routed to the compliance team?
10. Incident response [required]
- Runbook: link to the on-call runbook specific to this feature.
- Primary pager: on-call rotation; named role.
- Escalation path: named roles for safety incidents, data incidents, legal incidents.
- Disclosure duties: who decides on public disclosure; who drafts; legal and communications review.
- Retention: logs retained how long; evidence artefacts retained how long.
11. Related artefacts [recommended]
- Prompt registry record URL: link to the registry entry.
- Source repository path: where the prompt, few-shot set, and schemas live in version control.
- Feature documentation: user-facing docs or internal wiki.
- Related prompts: if this feature composes multiple prompts or shares components with other features, list them.
- Related tools: link to tool-schema files and their permission envelopes.
- Cross-references to this credential’s articles: Articles 1-10 of AITM-PEW that the design draws on.
12. Deprecation plan [recommended]
When the prompt is superseded or the feature is retired, complete this section before archiving:
- Deprecation date:
- Replacement prompt (if any):
- Migration evidence: harness results showing the replacement meets or exceeds this prompt’s thresholds.
- User-communication plan: how users learn of the change.
- Data-retention plan: how long logs and evidence artefacts persist after deprecation.
Completed-template checklist
Before the prompt moves from draft to canary, verify:
- All [required] sections are complete.
- Model binding names a specific version, not "latest".
- Guardrail bindings name specific products or team-built components at every layer.
- Evaluation plan lists threshold, scoring, and cadence for each of the six dimensions.
- Change control section links to the harness run, the diff, and the approvers’ sign-offs.
- Regulatory section addresses Article 50 disclosure.
- Incident-response runbook is linked.
Before the prompt moves from canary to active, verify:
- Canary-window harness results pass all thresholds.
- Online evaluation is configured and producing signal.
- No drift alerts were triggered during canary.
- Rollback procedure was tested at least once.
Before the prompt moves from active to deprecated, verify:
- Deprecation plan is complete.
- Replacement prompt is live and meeting thresholds.
- Users have been informed or the change is transparent to users.
Notes for adoption
This template is intended to be used as a living document. Teams can extend it with additional sections (privacy classification, data-residency posture, vendor contract references, cost-accounting tags) appropriate to their context. Teams should not, without explicit governance sign-off, reduce the template below the [required] sections listed above; each has been sized against the minimum a regulator, auditor, or incident responder reasonably expects.
For teams adopting this template for the first time: start by backfilling entries for the highest-volume and highest-stakes prompts already in production, rather than attempting to template every prompt at once. Backfilling surfaces the documentation gaps that matter most and produces the evidence of governance maturity that a first-time external review will look for.
© FlowRidge.io — COMPEL AI Transformation Methodology. All rights reserved.