Healthcare is the sector most likely to produce the phrase “the model gave me the wrong answer and I didn’t catch it.” The architectural response is not to hope the model is right; it is to design so that the clinician’s judgment is structurally retained and the agent’s role is structurally bounded.
The regulatory backbone
HIPAA (US). Privacy Rule, Security Rule, and Breach Notification Rule. Defines covered entities, business associates, minimum-necessary standard, patient access rights, and the penalties for breach. Agents processing PHI sit inside the business-associate relationship and must comply.
GDPR special-category data (Article 9). Health data is special-category; processing requires an Article 9(2) condition in addition to a lawful basis under Article 6. Explicit consent and public-health grounds are common conditions; the architecture must record which one applies.
EU Medical Device Regulation 2017/745 (MDR). Software intended to provide information used to take diagnostic or therapeutic decisions is regulated as a medical device. Under the software classification rule (Annex VIII, Rule 11), Class IIa or higher applies to many clinical-decision-support systems, including agentic ones, with conformity assessment and notified-body involvement required.
FDA software-as-a-medical-device guidance. FDA’s ongoing framework for AI/ML-enabled medical devices (2021 Action Plan; 2023 “predetermined change control plan” guidance) accommodates model-update lifecycles; the architect designs the agentic update lifecycle to fit.
GxP. Good practices for clinical trials (GCP), laboratory (GLP), manufacturing (GMP). Applies to agentic systems in clinical-research operations, lab-workflow support, or manufacturing; requires validated systems, audit trails, and change control.
Five sector-specific architecture patterns
Pattern 1 — Clinical-decision-support gating with clinician confirmation
Any agent producing advice that informs a clinical decision (diagnosis, treatment, dosing, triage) runs with a gated design: the agent produces a recommendation; the clinician reviews; the clinician confirms or overrides; the EHR records both the recommendation and the clinician’s final decision.
This pattern preserves clinical judgment structurally — not by hoping the clinician reads carefully, but by making the confirmation a required step in the workflow. Systems that rubber-stamp fail this pattern.
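The gate can be sketched as a commit function that refuses to write anything to the record without a clinician decision. This is a minimal illustration, not a reference implementation: `Recommendation`, `ClinicianDecision`, `GateError`, and `commit_to_ehr` are hypothetical names, and the returned dictionary stands in for whatever the real EHR integration writes.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Decision(Enum):
    CONFIRMED = "confirmed"
    OVERRIDDEN = "overridden"


@dataclass(frozen=True)
class Recommendation:
    patient_ref: str      # opaque reference, not raw PHI (assumed convention)
    text: str
    agent_version: str


@dataclass(frozen=True)
class ClinicianDecision:
    clinician_id: str
    decision: Decision
    final_order: str
    override_reason: Optional[str] = None


class GateError(Exception):
    """Raised when the workflow tries to bypass the confirmation step."""


def commit_to_ehr(rec: Recommendation, decision: Optional[ClinicianDecision]) -> dict:
    """Build the EHR entry; refuse to act on an unreviewed recommendation."""
    if decision is None:
        raise GateError("no clinician decision recorded; nothing is committed")
    if decision.decision is Decision.OVERRIDDEN and not decision.override_reason:
        raise GateError("an override must carry a reason")
    # Both the agent's recommendation and the clinician's final decision are kept.
    return {
        "patient_ref": rec.patient_ref,
        "agent_recommendation": rec.text,
        "agent_version": rec.agent_version,
        "clinician_id": decision.clinician_id,
        "decision": decision.decision.value,
        "final_order": decision.final_order,
        "override_reason": decision.override_reason,
    }
```

The design choice is that the unreviewed path raises instead of defaulting to the agent's output, so skipping review is a workflow error and not a silent fallback.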
Pattern 2 — Audit-trail completeness for regulated decisions
Healthcare audit expectations exceed what generic IT audit provides. For an agentic clinical-decision-support system:
- Every agent recommendation is logged: agent version, prompt version, model version, inputs (with minimum-necessary redaction), evidence retrieved, recommendation, clinician’s decision, outcome if captured.
- Retention: set per jurisdiction and specialty (often 7–10 years for adult records; 21+ years for paediatric records in some jurisdictions).
- Patient-access path: the audit trail is part of the medical record and accessible to the patient under HIPAA right of access or GDPR Article 15.
- Reconstructability: a clinician reviewing the case 18 months later can reconstruct why the agent recommended what it did.
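One way to picture the logged fields is as a single immutable record with a canonical serialization. The sketch below is an assumption about shape only: the `AuditRecord` class, its field names, and the SHA-256 digest (added here as a simple tamper-evidence aid) are illustrative and not taken from any specific EHR or standard.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class AuditRecord:
    agent_version: str
    prompt_version: str
    model_version: str
    redacted_inputs: dict          # already filtered to minimum necessary
    evidence_refs: list            # identifiers of the evidence retrieved
    recommendation: str
    clinician_decision: str
    outcome: Optional[str] = None  # captured later, if at all


def serialize(record: AuditRecord) -> str:
    """Canonical JSON so the same record always yields the same text."""
    return json.dumps(asdict(record), sort_keys=True, separators=(",", ":"))


def digest(record: AuditRecord) -> str:
    """SHA-256 over the canonical form (an assumed integrity check)."""
    return hashlib.sha256(serialize(record).encode("utf-8")).hexdigest()
```

Pinning agent, prompt, and model versions in the same record as the inputs and evidence is what makes the 18-month reconstruction possible; retention and patient access would be handled by the store that holds these records.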
Pattern 3 — Minimum-necessary PHI in context
HIPAA Minimum Necessary is a rule about PHI access; the architect applies it to context-window composition. The agent receives the minimum PHI needed for the task — not the patient’s full record. Pre-context filters strip fields not required for the specific task.
For research agents subject to 42 CFR Part 2 (US substance-use disorder records), consent-specific access rules are enforced at retrieval time; the agent never sees Part 2 records for which consent is absent.
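Both filters can be sketched in a few lines. Everything named here is hypothetical: the `TASK_FIELDS` allowlists, the field names, and the `part2` flag on retrieved documents are placeholders for whatever clinical and privacy review actually defines.

```python
from typing import Dict, List, Set

# Hypothetical task-to-field allowlists; real ones come from clinical and privacy review.
TASK_FIELDS: Dict[str, Set[str]] = {
    "medication_reconciliation": {"age", "weight_kg", "allergies", "active_medications", "egfr"},
    "note_summarization": {"age", "encounter_transcript", "problem_list"},
}


def build_context(record: dict, task: str) -> dict:
    """Keep only the fields allowlisted for this task before prompting."""
    if task not in TASK_FIELDS:
        raise KeyError(f"no allowlist defined for task {task!r}")
    allowed = TASK_FIELDS[task]
    return {k: v for k, v in record.items() if k in allowed}


def filter_part2(documents: List[dict], consented_ids: Set[str]) -> List[dict]:
    """Drop Part 2 documents that lack consent, at retrieval time."""
    kept = []
    for doc in documents:
        is_part2 = doc.get("part2", True)  # a missing flag is treated as Part 2 (fail closed)
        if not is_part2 or doc["id"] in consented_ids:
            kept.append(doc)
    return kept
```

An allowlist is used instead of a blocklist so that a new field added to the record stays out of the context until someone approves it for a task.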
Pattern 4 — Change control under MDR / FDA predetermined change control
Model and prompt updates in a regulated clinical-decision-support system are not free. Under MDR, changes that affect intended use or performance require notified-body review. Under FDA’s predetermined change control plan guidance, a device maker can pre-declare certain change categories the system may undergo without new submission; the architect collaborates with regulatory affairs to define this plan at design time.
The registry design (Article 26) carries change categorization in the agent record so the promotion workflow routes material changes through regulatory review.
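The routing idea can be sketched as a small function over the change record. The change-type strings, the `PCCP_COVERED` set, the market codes, and the route names are all hypothetical; the real categories are whatever regulatory affairs agrees in the change-control plan, and this simplification treats any change to intended use as needing review.

```python
from dataclasses import dataclass

# Hypothetical change types pre-declared in a predetermined change control plan.
PCCP_COVERED = frozenset({"threshold_tuning", "model_retrain_same_architecture"})


@dataclass(frozen=True)
class ChangeRequest:
    agent_id: str
    change_type: str
    affects_intended_use: bool
    affects_performance: bool


def route(change: ChangeRequest, market: str) -> str:
    """Pick the promotion path for a change in a given market ("US" or "EU")."""
    material = change.affects_intended_use or change.affects_performance
    if not material:
        return "standard_release"
    # Only the US path honours pre-declared categories in this sketch,
    # and never for a change to intended use.
    if market == "US" and not change.affects_intended_use and change.change_type in PCCP_COVERED:
        return "pccp_protocol"
    return "regulatory_review"
```

Carrying `affects_intended_use` and `affects_performance` on the record means the promotion workflow does not have to infer materiality from a free-text description.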
Pattern 5 — Contraindication and safety-signal surfacing
For clinical-decision-support agents, the safest useful output is one that makes the clinician aware of relevant contraindications, drug-drug interactions, and safety signals alongside the recommendation. The pattern forces the agent to surface concerns explicitly as separate items rather than embed them in narrative. The clinician's workflow then includes a check-the-box confirmation that they have seen each contraindication.
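A structured output makes this enforceable: concerns are separate items with identifiers, and confirmation is blocked until each one is acknowledged. The sketch below assumes hypothetical names throughout (`SafetyConcern`, `AgentOutput`, `unacknowledged`, `can_confirm`) and leaves the user interface out.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass(frozen=True)
class SafetyConcern:
    concern_id: str
    kind: str          # e.g. "contraindication", "interaction", "safety_signal"
    description: str


@dataclass(frozen=True)
class AgentOutput:
    recommendation: str
    concerns: List[SafetyConcern]   # surfaced as items, never buried in the narrative


def unacknowledged(output: AgentOutput, acknowledged_ids: Set[str]) -> List[SafetyConcern]:
    """Concerns the clinician has not yet ticked off."""
    return [c for c in output.concerns if c.concern_id not in acknowledged_ids]


def can_confirm(output: AgentOutput, acknowledged_ids: Set[str]) -> bool:
    """Confirmation is allowed only once every concern is acknowledged."""
    return not unacknowledged(output, acknowledged_ids)
```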
Three healthcare use cases, classified
Use case A — Clinical-note summarization and coding assistant (Nabla Copilot, Abridge-style).
- Regulatory tier: varies; in the US, use for billing/coding brings it into CMS-relevant workflow; in the EU, a summary that informs clinical decisions may be MDR-regulated.
- Architecture: ambient-transcription agent; clinician reviews and confirms the note; all edits logged; minimum-necessary PHI in context; integration with EHR; audit trail for billing-relevant claims.
Use case B — Radiology triage agent (second-reader style).
- Regulatory tier: MDR Class IIa or higher; FDA Class II with 510(k) clearance typical; high regulatory burden.
- Architecture: agent produces triage priority; radiologist is the decision-maker; clinician confirms or overrides; performance monitored with sensitivity/specificity against ground truth; change-control plan for model updates.
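The monitoring step reduces to counting agreement between the agent's flags and ground truth over a review window. A minimal sketch, assuming boolean flags per study and a hypothetical function name; how ground truth is established (final radiologist report, follow-up) is outside its scope.

```python
from typing import Optional, Sequence, Tuple


def sensitivity_specificity(
    flagged: Sequence[bool], truth: Sequence[bool]
) -> Tuple[Optional[float], Optional[float]]:
    """Return (sensitivity, specificity); None where the denominator is zero."""
    if len(flagged) != len(truth):
        raise ValueError("flagged and truth must have the same length")
    tp = sum(1 for f, t in zip(flagged, truth) if f and t)
    fn = sum(1 for f, t in zip(flagged, truth) if not f and t)
    tn = sum(1 for f, t in zip(flagged, truth) if not f and not t)
    fp = sum(1 for f, t in zip(flagged, truth) if f and not t)
    sensitivity = tp / (tp + fn) if (tp + fn) else None
    specificity = tn / (tn + fp) if (tn + fp) else None
    return sensitivity, specificity
```

Returning `None` for an empty denominator keeps a window with no positive cases from being reported as perfect or as zero sensitivity.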
Use case C — Medication reconciliation and contraindication agent.
- Regulatory tier: MDR-regulated if producing clinical advice; HIPAA-regulated in the US.
- Architecture: agent reviews medication list, flags interactions and duplicates, surfaces contraindications; pharmacist/clinician confirms before prescribing changes; structured evidence for each flagged item; integration with drug-information database (with curation by internal pharmacy).
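The flagging step can be sketched as a pure function from a medication list to structured flags. The `INTERACTIONS` table here is a one-entry placeholder for illustration only, standing in for the curated drug-information database; it is not a clinical reference, and the flag format is an assumption.

```python
from collections import Counter
from itertools import combinations
from typing import List

# Placeholder table; in practice this is the curated drug-information database.
INTERACTIONS = {
    frozenset({"warfarin", "aspirin"}): "increased bleeding risk",
}


def reconcile(med_list: List[str]) -> List[dict]:
    """Flag duplicates and known interactions, each with its evidence."""
    counts = Counter(m.strip().lower() for m in med_list)
    flags = []
    for name, n in sorted(counts.items()):
        if n > 1:
            flags.append({"type": "duplicate", "drugs": [name], "evidence": f"listed {n} times"})
    for a, b in combinations(sorted(counts), 2):
        note = INTERACTIONS.get(frozenset({a, b}))
        if note:
            flags.append({"type": "interaction", "drugs": [a, b], "evidence": note})
    return flags
```

Each flag carries its own evidence string so the pharmacist or clinician confirms against a stated reason, not a bare alert.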
Clinician-oversight design
The Article 14 human-oversight design for healthcare agents demands specifics:
- Preserved judgment. The clinician’s decision, not the agent’s recommendation, drives downstream action.
- Override reasoning. When the clinician overrides, the workflow captures the reason — this is the highest-value training signal and regulatory evidence.
- Confidence calibration. The agent’s recommendations carry calibrated confidence; consistently low-confidence cases go to a different path (request for additional tests, senior consultation).
- Fatigue-aware design. Alert fatigue is a documented clinical-decision-support failure mode; the architect limits alert frequency and tunes thresholds with clinician input.
- Override-rate monitoring. A low override rate may indicate rubber-stamping; a high override rate may indicate miscalibrated recommendations; the architect monitors this rate over time.
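Override-rate monitoring can be sketched as a rate plus a band check. The band limits below (2% and 40%) are placeholders chosen for illustration, not recommended values; in practice they would be set per use case with clinician input, and the function names are hypothetical.

```python
from typing import Sequence


def override_rate(decisions: Sequence[str]) -> float:
    """Share of decisions in the window recorded as "overridden"."""
    if not decisions:
        raise ValueError("no decisions in the monitoring window")
    return sum(1 for d in decisions if d == "overridden") / len(decisions)


def classify_rate(rate: float, low: float = 0.02, high: float = 0.40) -> str:
    """Map a rate to a review signal; the band limits are illustrative only."""
    if rate < low:
        return "possible_rubber_stamping"
    if rate > high:
        return "possible_miscalibration"
    return "within_band"
```

Either signal is a prompt for human review of the workflow, not an automatic verdict: a low rate can also mean the agent is simply accurate on that case mix.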
Framework selection in healthcare contexts
Healthcare architects applying the Article 39 build-vs-buy framework face distinctive constraints:
- BAA (Business Associate Agreement) availability. In US deployments, any model provider touching PHI must execute a BAA. As of 2025, Anthropic, OpenAI (enterprise + API with specific configurations), Google (via Vertex AI with specific configurations), and Microsoft (via Azure OpenAI) offer BAA-covered services. The architect verifies current status per use case and documents it in the ADR.
- Sovereignty for EU deployments. GDPR special-category data and national healthcare rules often require EU processing; Anthropic, OpenAI and Google provide EU regions with varying maturity; Mistral (headquartered in France) is a sovereignty-motivated option.
- On-premise or VPC deployments. For hospitals with strict data-control policies, self-hosted inference (vLLM + open-weights models like Llama 3, Mistral, or purpose-tuned medical models) may be preferred despite the operational burden.
- Framework maturity in healthcare. LangGraph and Semantic Kernel both appear in healthcare production deployments; CrewAI and AutoGen are less common in regulated clinical use. OpenAI Agents SDK is growing in healthcare-admin contexts.
- Integration with EHR. Epic's App Orchard / Showroom and the interoperability tools and open-platform interfaces of Oracle Health (formerly Cerner) are where agentic tools actually live. The architect designs for these integration surfaces, not for generic integration.
The architect balances clinical-team familiarity with these options against operational, regulatory, and cost constraints.
Real-world anchors
Epic AI integration (public documentation, 2023–2024). Epic publicly documents AI integrations with the EHR covering draft note generation, inbox triage, and chart summary. Patterns: clinician-in-the-loop; EHR-native audit trail; enterprise configuration of model endpoints; draft-then-confirm flows.
Nabla Copilot for clinicians (public architecture posts). Nabla describes an ambient-AI scribe with transcription + note generation; the clinician always reviews and edits before signing; zero-retention mode for PHI in transit; configurability for specialty-specific templates.
MHRA (UK) software-as-a-medical-device guidance and FDA pre-market clearance precedents. Authoritative reference for how agentic clinical-decision-support is regulated.
Research and life-sciences variants
Life-sciences R&D agents. Drug-discovery agents (literature review, target identification, hit triage) are generally lower-regulatory-burden than clinical-decision-support but may enter GxP scope if they produce outputs that inform regulatory submissions.
Clinical-trial-operations agents. Agents supporting trial startup, enrollment monitoring, or protocol-deviation review are subject to GCP; the architecture inherits audit-trail, validation, and change-control expectations.
Lab-workflow agents. Agents in labs supporting sample triage, result interpretation, or QC review are in GLP scope; audit trail and data-integrity (ALCOA+) expectations apply.
Anti-patterns to reject
- “Clinician reviews” (without workflow enforcement). Review that can be skipped is review that will be skipped.
- “Full patient record in context.” Violates Minimum Necessary; increases PHI exposure; often not even useful to the model.
- “We’ll update the model quarterly.” Without predetermined change control or notified-body engagement, model updates may require new submissions.
- “The agent’s confidence is the clinician’s confidence.” They are not; the architect forces calibration evidence before promotion.
- “Audit log lives only in our platform.” Clinical audit must be in the medical record; in-platform-only logs fail legal-hold requests.
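For the calibration anti-pattern, one common form of calibration evidence is expected calibration error (ECE) over a held-out set: bin predictions by stated confidence and compare each bin's average confidence with its observed accuracy. The sketch below uses equal-width bins and a hypothetical function name; the choice of ECE as the promotion metric is an assumption, not something the text above prescribes.

```python
from typing import Sequence


def expected_calibration_error(
    confidences: Sequence[float], correct: Sequence[bool], n_bins: int = 10
) -> float:
    """Weighted mean gap between stated confidence and observed accuracy."""
    if len(confidences) != len(correct) or not confidences:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(c < 0.0 or c > 1.0 for c in confidences):
        raise ValueError("confidences must lie in [0, 1]")
    bins = [[] for _ in range(n_bins)]
    for c, y in zip(confidences, correct):
        idx = min(int(c * n_bins), n_bins - 1)  # confidence 1.0 goes in the top bin
        bins[idx].append((c, y))
    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, y in members if y) / len(members)
        ece += (len(members) / total) * abs(avg_conf - accuracy)
    return ece
```

An agent that states 0.9 confidence and is right nine times in ten scores near zero; the same stated confidence with 50% accuracy scores about 0.4, which is the kind of gap a promotion gate would block on.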
Learning outcomes
- Explain healthcare agentic architecture constraints — HIPAA, GDPR Article 9, MDR, FDA software-as-a-medical-device, GxP.
- Classify three healthcare use cases by regulatory tier and map each to clinician-oversight patterns.
- Evaluate a healthcare design for clinician-oversight completeness, minimum-necessary compliance, and change-control adequacy.
- Design a healthcare agentic plan including the clinical-decision-support gate, the audit trail, the override-rate monitoring, and the predetermined change-control plan.
Further reading
- Core Stream anchors: EATE-Level-3/M3.4-Art14-EU-AI-Act-Article-6-High-Risk-Classification-Deep-Dive.md.
- AITE-ATS siblings: Article 6 (authorization), Article 10 (HITL), Article 23 (EU AI Act + MDR), Article 24 (lifecycle + change control).
- Primary sources: HIPAA Privacy Rule (45 CFR Part 164); EU MDR 2017/745; FDA AI/ML Software as a Medical Device Action Plan (2021); Epic AI integration documentation; Nabla Copilot public architecture posts.