AITM M1.4-Art52 v1.0 Reviewed 2026-04-06 Open Access
M1.4 AI Technology Foundations for Transformation
AITF · Foundations

Lab: Build a Decision-Rights Matrix for an AI Risk Escalation



COMPEL Specialization — AITM-OMR: AI Operating Model Associate Lab 2 of 2


Lab brief

You are the AITM-OMR specialist for Haven Health Plans, a fictional composite drawn from US regional health-insurance enterprises. Haven operates an AI-driven claims-review system that flags claims for potential fraud, waste, and abuse. The system assigns each claim a risk score; claims scoring above a threshold are routed to a specialist reviewer, and claims below the threshold proceed to routine processing. The chief operating officer, who is the accountable executive for claims operations, has received a notice from the state insurance regulator. The regulator has asked Haven to describe, within 30 days, the decision-rights architecture that governs the claims-review model — who decided to deploy it, who can override its decisions, who is accountable for its outcomes, and how a claimant whose claim is flagged unfairly can seek redress. Your task is to build the decision-rights matrix that Haven will produce in response.

Lab inputs (summarized)

You have the following evidence at intake:

  • The claims-review system has been in production for 18 months. It was built by Haven’s internal data-science team in collaboration with an external consulting firm and was deployed without an explicit AI-specific governance process at the time.
  • The model classifies approximately 15% of claims above the threshold. Of those, about 40% are eventually confirmed as legitimate (false positives); about 60% are confirmed as fraudulent, wasteful, or abusive (true positives). The false-positive rate has produced some complaints from providers and members.
  • The model was initially deployed with approval from the VP of Claims Operations and the head of Data Science, but there is no written record of a risk classification, an architecture review, or an explicit go-live sign-off.
  • The current organizational structure places the data-science team in the IT function (reporting to the CIO), the claims operations team in Claims (reporting to the COO), the risk and compliance function in a separate group (reporting to the general counsel), and the internal audit function as independent (reporting to the Audit Committee of the board).
  • The EU AI Act does not apply directly to Haven (US-only operations), but US federal and state regulations apply, including NAIC AI model-governance guidance and the state regulator’s specific inquiry.
  • NIST AI RMF is the framework Haven’s risk function has adopted, and an unused AI risk-classification policy was drafted a year ago but never operationalized.
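The flag and false-positive rates in the brief imply concrete volumes worth having at hand before the exercises begin. A quick back-of-envelope sketch, using an illustrative batch of 1,000 claims (the batch size is an assumption; the rates come from the lab inputs):

```python
# Back-of-envelope volumes implied by the lab inputs. The 1,000-claim
# batch is illustrative; the 15% flag rate and the 40% false-positive
# rate among flagged claims come from the brief.
CLAIMS = 1_000
FLAG_RATE = 0.15            # share of claims scored above the threshold
FALSE_POSITIVE_RATE = 0.40  # share of flagged claims confirmed legitimate

flagged = round(CLAIMS * FLAG_RATE)
false_positives = round(flagged * FALSE_POSITIVE_RATE)
true_positives = flagged - false_positives

print(flagged)          # 150 claims routed to specialist review
print(false_positives)  # 60 legitimate claims flagged in error
print(true_positives)   # 90 confirmed fraud/waste/abuse catches
```

Sixty wrongly flagged claims per thousand is the concrete number behind the provider and member complaints, and the number the threshold-adjustment RAPID in Exercise 3 ultimately governs.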

Exercise 1 — Classify the system against risk tier (10 minutes)

Using the tiering approach from Article 5 (EU AI Act-style tiers combined with Haven’s internal risk classification), classify the claims-review system and write a one-paragraph defense.

A candidate classification might read: “High-risk. The system makes consequential decisions affecting access to insurance benefits, uses personal health and claims data, has demonstrated false-positive rates that affect legitimate providers and members, and is subject to regulatory inquiry. Under the NIST AI RMF MAP function, the system scores high on impact magnitude, high on impact severity, and moderate on reversibility (incorrect flags can be reversed but at a cost to claimants). A high-risk classification triggers the full four-domain separation-of-duties design and the RAPID-style decision-rights framework.”
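The tiering logic in that defense can be made explicit. The sketch below is an illustrative scoring rule, not an official NIST AI RMF scheme; the factor names mirror the MAP-style factors in the candidate classification, and the high-risk trigger is an assumption:

```python
# Illustrative MAP-style tiering rule (an assumption for this lab, not
# an official NIST AI RMF scoring scheme).
def classify_risk_tier(impact_magnitude: str, impact_severity: str,
                       reversibility: str) -> str:
    """Return 'high', 'limited', or 'minimal' from three rated factors."""
    if "high" in (impact_magnitude, impact_severity):
        return "high"  # any high-rated impact factor forces the top tier
    scores = {"low": 0, "moderate": 1, "high": 2}
    total = (scores[impact_magnitude] + scores[impact_severity]
             + scores[reversibility])
    return "limited" if total >= 2 else "minimal"

# Haven's claims-review system, per the candidate classification:
print(classify_risk_tier("high", "high", "moderate"))  # high
```

Writing the rule down, even informally, is what turns the one-paragraph defense into something a regulator can test against other systems in Haven's portfolio.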

Exercise 2 — Identify the decision types (15 minutes)

List the named decisions that the claims-review system requires over its lifecycle. For each decision, name whether it has been made (and when) or whether it is pending.

A complete list typically includes: initial deployment (made 18 months ago, no written record); model version updates (recurring, no written process); threshold adjustment (has occurred at least twice informally); false-positive override (occurs daily in claims operations); claimant dispute resolution (occurs weekly); periodic model-risk review (not yet performed); retraining authorization (pending, since the model has begun drifting on recent claims); retirement decision (not yet triggered).

For each decision, note the risk level: is this a high-risk decision requiring the full RAPID, or a lower-risk operational decision that a lighter framework covers?
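A lightweight registry keeps the inventory auditable as statuses change. The entries mirror the typical list above; the field names and the Python representation are illustrative assumptions:

```python
# Decision inventory for the claims-review system, mirroring the
# typical list in the exercise. Field names are illustrative.
DECISIONS = [
    {"name": "initial deployment",          "status": "made",          "risk": "high"},
    {"name": "model version update",        "status": "recurring",     "risk": "high"},
    {"name": "threshold adjustment",        "status": "recurring",     "risk": "high"},
    {"name": "false-positive override",     "status": "recurring",     "risk": "operational"},
    {"name": "claimant dispute resolution", "status": "recurring",     "risk": "operational"},
    {"name": "periodic model-risk review",  "status": "pending",       "risk": "high"},
    {"name": "retraining authorization",    "status": "pending",       "risk": "high"},
    {"name": "retirement decision",         "status": "not triggered", "risk": "high"},
]

needs_full_rapid = [d["name"] for d in DECISIONS if d["risk"] == "high"]
print(len(needs_full_rapid))  # 6 decisions warrant the full RAPID treatment
```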

Exercise 3 — Design the RAPID for model deployment decisions (25 minutes)

For the top three high-risk decision types from Exercise 2 (typically: deployment, threshold adjustment, retraining authorization), build a RAPID matrix using the roles available in Haven’s organizational structure. For each decision type, name the recommender, the agreers, the input providers, the performer, and the decider. Include a rationale for each role assignment.

A sample RAPID for threshold adjustment might read: “Recommender — head of data science (the technical specialist who analyzes the trade-off between false-positive and false-negative rates and proposes the threshold). Agreers — VP of Claims Operations (whose operational capacity is affected), head of risk and compliance (whose risk assessment must endorse the trade-off), provider-relations VP (whose provider complaints are the false-positive cost). Input — internal audit (reviews proposed change for documentation completeness, does not have agreer role). Performer — data science engineering team (implements the threshold change). Decider — COO (accountable executive for claims operations; holds final authority). Rationale — the RAPID separation distributes recommendation, agreement, and decision across four functions, preventing any single function from controlling the trade-off. The audit function is input rather than agreer because audit’s role is documentation discipline, not operational sign-off.”

Produce the RAPID matrices for all three decision types. Cross-check that no single role holds both recommender and decider for the same decision, and that the agreer set includes the risk function independently.
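Both cross-checks can be mechanized so they run against every matrix before submission. The entry below mirrors the sample threshold-adjustment RAPID; the check logic is an illustrative sketch:

```python
# The sample threshold-adjustment RAPID, plus the two separation
# checks named in the exercise. Role names mirror Haven's structure.
rapid = {
    "threshold_adjustment": {
        "recommend": "head of data science",
        "agree": ["VP Claims Operations", "head of risk and compliance",
                  "provider-relations VP"],
        "input": ["internal audit"],
        "perform": "data-science engineering team",
        "decide": "COO",
    },
}

def check_rapid(matrix, risk_role="head of risk and compliance"):
    """Return a list of separation-of-duties violations (empty = clean)."""
    problems = []
    for decision, roles in matrix.items():
        if roles["recommend"] == roles["decide"]:
            problems.append(f"{decision}: recommender also holds decider")
        if risk_role not in roles["agree"]:
            problems.append(f"{decision}: risk function missing from agreers")
    return problems

print(check_rapid(rapid))  # [] -> both cross-checks pass
```

Running the same check over all three matrices catches the most common design error: quietly letting the function that proposed a change also decide it.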

Exercise 4 — Design the four-domain separation (20 minutes)

For the claims-review system, walk through the four-domain separation from Article 5 (builder, operator, risk owner, sign-off) and assign named roles from Haven’s organization to each domain. Identify any current gaps — places where the real organization today consolidates domains that should be separate.

A sample design and gap analysis might read: “Builder — head of data science and the data-science team (currently in IT, reporting to CIO). Operator — claims operations systems team (currently part of IT operations, also reporting to CIO). Risk owner — head of risk and compliance (currently reporting to general counsel). Sign-off authority — COO for deployment decisions, VP of Claims Operations for operational thresholds. Gap — the builder and operator both report to the CIO, which consolidates two of the four domains. Mitigation — the builder team and operator team have separate leads within IT, and the lead data scientist does not have operational-change authority without VP Claims sign-off; the separation holds at the working level even though the reporting line is unified. Second gap — the 18-month-ago deployment decision was made without the risk owner in the chain, a historical separation failure that the current design corrects.”
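The reporting-line consolidation called out in that gap analysis is the kind of check worth automating so it re-fires after every reorganization. The domain-to-reporting-line map mirrors Haven's current structure; the detection logic is an illustrative sketch:

```python
# Four-domain separation check: flag any two domains that share a
# reporting line. The mapping mirrors Haven's current org structure.
DOMAINS = {
    "builder":    "CIO",              # data-science team sits in IT
    "operator":   "CIO",              # claims-ops systems team also in IT
    "risk_owner": "general counsel",
    "sign_off":   "COO",
}

def consolidation_gaps(domains):
    """Return pairs of domains that report up the same line."""
    names = list(domains)
    return [(a, b) for i, a in enumerate(names) for b in names[i + 1:]
            if domains[a] == domains[b]]

print(consolidation_gaps(DOMAINS))  # [('builder', 'operator')]
```

The output reproduces the first gap in the sample analysis; the mitigation (separate working-level leads, no operational-change authority without VP Claims sign-off) lives in the narrative, not the map.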

Exercise 5 — Draft the accountability matrix (20 minutes)

Produce a one-page accountability matrix for the claims-review system covering: the system itself (one row), the top three risks (three rows), the top three controls (three rows), and the top three outcomes (three rows). For each row, name the single accountable owner (not a committee, a named role).

A sample matrix entry might read: “Risk — disparate impact on specific provider populations. Accountable — head of risk and compliance. Evidence — quarterly disparate-impact analysis report; quarterly review with VP Claims Operations; annual external fairness audit. Escalation — to general counsel if threshold for material disparate impact is exceeded, to COO and Audit Committee if the finding is sustained.”

The matrix is the core artifact the regulator will ask for; producing it in clean form is the specialist’s primary deliverable.

Exercise 6 — Draft the claimant redress decision chain (15 minutes)

For the claimant who disputes a flag, design the decision chain that governs their complaint. Name each step, the role that makes the decision at that step, the documentation required, and the escalation trigger.

A sample redress chain might read: “Step 1 — initial dispute received by Claims Operations. Claims specialist reviews the flag rationale and the claimant’s dispute, can clear the flag if evidence supports clearance. Documentation — written rationale in claim record. Escalation — if specialist cannot clear on first review, proceeds to Step 2. Step 2 — Senior claims reviewer and data-science investigator examine the claim and the model’s reasoning. Can clear the flag, can sustain the flag, can escalate to Step 3. Documentation — written investigation record including the specific model features that drove the flag. Escalation — if sustained and the claimant pursues external complaint, proceeds to Step 3. Step 3 — External review by independent contractor (not Haven employee) or state insurance department intake. Decision is binding on Haven. Documentation — external review produces its own record, filed with Haven’s claims audit trail.”
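The three-step chain can be expressed as a small escalation table, which makes the allowed outcomes at each step unambiguous. Step names and outcomes mirror the sample chain; the traversal function is an illustrative assumption:

```python
# The redress chain as an escalation table. "escalate" moves to the
# next step; any other allowed outcome terminates the chain.
REDRESS_CHAIN = [
    {"step": 1, "decider": "claims specialist",
     "outcomes": {"clear", "escalate"}},
    {"step": 2, "decider": "senior reviewer + data-science investigator",
     "outcomes": {"clear", "sustain", "escalate"}},
    {"step": 3, "decider": "independent external reviewer",
     "outcomes": {"clear", "sustain"}},  # binding on Haven
]

def next_step(current, outcome):
    """Return the next step number, or None when the chain terminates."""
    stage = REDRESS_CHAIN[current - 1]
    if outcome not in stage["outcomes"]:
        raise ValueError(f"step {current} cannot produce {outcome!r}")
    return current + 1 if outcome == "escalate" else None

print(next_step(1, "escalate"))  # 2: unresolved dispute moves to investigation
print(next_step(3, "sustain"))   # None: the external decision is binding
```

Note that step 3 has no "escalate" outcome: encoding the chain forces the designer to state explicitly that the external review is terminal.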

Exercise 7 — Regulator response summary (10 minutes)

Draft the one-paragraph executive summary Haven will submit to the state regulator. The summary should name the claims-review system, describe its risk classification, state the decision-rights architecture in place (current and corrective), and name the next steps Haven is taking.

A candidate summary might read: “The claims-review system is a high-risk AI system as classified under Haven’s NIST AI RMF-aligned risk framework. Haven has established a formal decision-rights architecture for the system: deployment decisions are governed by a RAPID framework with independent risk, compliance, and provider-relations review; operational threshold adjustments follow the same RAPID with a separate COO sign-off; retraining authorization requires full risk-function review. The four-domain separation of duties between builder (data science), operator (claims operations systems), risk owner (risk and compliance function), and sign-off authority (COO for deployment, VP Claims for operations) is documented and audited. Claimants who dispute a flag have a three-step redress path culminating in independent external review. Haven acknowledges that the original 18-month-ago deployment lacked this formal architecture; a retrospective risk classification and documentation remediation is in progress with completion targeted within 90 days. Haven will provide the state regulator with the full decision-rights matrix, accountability matrix, and redress-chain documentation as supporting evidence.”

Debrief

The lab produces the core deliverable that an operating-model specialist hands to a compliance function when regulatory scrutiny lands: a decision-rights matrix, an accountability matrix, a four-domain separation design, and a claimant-redress chain that the regulator can read and defend. A well-run debrief compares how different learners handled the consolidated builder-operator domain in the IT function (some will argue the consolidation is acceptable with strong working-level separation; others will recommend a reporting-line change) and how aggressively they treat the retrospective deployment gap. The instructor’s most useful feedback addresses defensibility under adversarial review: would the design withstand follow-up questions from a skeptical regulator, or would the first counter-question expose a gap? Decision-rights designs are tested not by their elegance but by their survival under scrutiny.



© FlowRidge.io — COMPEL AI Transformation Methodology. All rights reserved.