Stage 5 of 6

Evaluate

Assess the effectiveness of your transformation program through audits, gate reviews, and re-attestation cycles. Determine what is working and what needs adjustment.

Strategic Objective

Conduct comprehensive reviews of KPIs, control performance, adoption metrics, incident/risk indicators, and ROI to determine transformation effectiveness and areas for improvement.

Operational Objective

Execute KPI reviews, control performance audits, adoption assessments, incident analysis, and ROI calculations to produce an evidence-based evaluation of all active AI operations.

Evaluate — Stage Flow
  1. Inputs

    • from Produce: Deployed Models
    • from Produce: Monitoring Instrumentation
    • from Produce: Operational Controls and Evidence
    • Evaluation Metric Definitions
    • Red Team Playbook
    • Bias Testing Protocol
  2. Activities (17)

    • Gate Review execution
    • Audit center management
    • Re-attestation triggers and cycles
    • Risk acceptance reviews
    • Governance scorecard assessment
    • Model retirement evaluation
    • Stakeholder validation reviews
    • Benchmarking against success criteria
    • Bias and fairness testing
    • Business value validation
    • Regulatory conformity assessment
    • Internal audit execution
    • Audit preparation and support
    • Agent performance monitoring and trust score tracking
    • Agent behavior drift detection and compliance assessment
    • Vendor performance monitoring and supply chain audit
    • AI-BOM review and supplier maturity assessment
  3. Quality Gate — Gate E

    • Audit complete
    • Gate reviews passed
    • Risk acceptance documented
    • Conformity assessment complete for applicable regulations
    • Compliance evidence collected and verified
    • Regulatory documentation package complete
    • Cross-framework alignment validated
  4. Outputs (13)

    • Gate review decisions and action items
    • Audit findings and remediation plans
    • Re-attestation records
    • Risk acceptance register
    • Transformation effectiveness scorecard
    • Bias and Fairness Testing Report
    • Business Value Validation Report
    • Conformity Assessment Record
    • COMPEL Governance Scorecard
    • Stakeholder Approval Register
    • Agent trust score report and incident review
    • Vendor performance report and supplier maturity scorecard
    • Supply chain audit report with AI-BOM verification
  5. Handoffs

    • Learn: Evaluation reports
    • Learn: Incident logs
    • Learn: Drift findings
    • Learn: Audit findings and gate decisions

Inputs

External inputs (3)

  • Evaluation Metric Definitions

    The success metrics and KPIs established as transformation success criteria during Calibrate. Evaluate uses these to benchmark whether the program is delivering the outcomes leadership signed up for.

    NIST AI RMF (Measure function), ISO/IEC 25059
  • Red Team Playbook

    The standard adversarial testing methodology for AI systems. Evaluate uses the playbook to run repeatable red team exercises and to compare results across systems and time.

    MITRE ATLAS, OWASP LLM Top 10, NIST AI RMF Manage 2.1
  • Bias Testing Protocol

    The standard fairness and bias testing methodology. Evaluate uses the protocol so bias assessments are consistent across models and defensible to auditors.

    NIST SP 1270, IEEE 7003, EU AI Act Article 10
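
To make the Bias Testing Protocol input more concrete, here is a minimal sketch of one fairness check a protocol of this kind might include: the demographic parity difference, meaning the gap between the highest and lowest positive-outcome rates across groups. The page does not say which metrics the protocol prescribes, so the metric choice, the function names, and the sample data are all assumptions made for illustration.

```python
from collections import defaultdict


def selection_rates(outcomes, groups):
    """Return the share of positive outcomes (1) for each group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for outcome, group in zip(outcomes, groups):
        totals[group] += 1
        positives[group] += outcome
    return {g: positives[g] / totals[g] for g in totals}


def demographic_parity_difference(outcomes, groups):
    """Gap between the highest and lowest group selection rates."""
    rates = selection_rates(outcomes, groups)
    return max(rates.values()) - min(rates.values())


# Hypothetical model decisions (1 = approved) and group labels.
outcomes = [1, 0, 1, 1, 0, 1, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

gap = demographic_parity_difference(outcomes, groups)
print(f"Selection rates: {selection_rates(outcomes, groups)}")
print(f"Demographic parity difference: {gap:.2f}")
```

Running the same function with the same inputs on every model is what would make results comparable across systems. Any acceptance threshold for the gap would need to come from the protocol itself.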

Handoff inputs from prior stages (3)

  • Deployed Models

    from Produce

    The production AI systems handed over from Produce. Evaluate runs gate reviews, conformity assessments, and performance audits against these live deployments.

    COMPEL Stage — Produce
  • Monitoring Instrumentation

    from Produce

    The telemetry, dashboards, and alerts wired up during Produce. Evaluate uses this data feed to compute trust scores, drift signals, and audit evidence without manual collection.

    COMPEL Stage — Produce
  • Operational Controls and Evidence

    from Produce

    The control library and evidence repository produced during Produce. Evaluate maps audit findings and conformity assessments back to these controls.

    COMPEL Stage — Produce
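
The Monitoring Instrumentation input above is described as the data feed for drift signals. As one illustration of what such a signal could look like, the sketch below computes a Population Stability Index (PSI) between a baseline sample and a recent sample of a single numeric feature. PSI is a common drift statistic, but the page does not name it; the function name, bin count, and sample values are assumptions.

```python
import math


def population_stability_index(expected, actual, bins=10, eps=1e-6):
    """PSI between a baseline sample and a recent sample.

    Bin edges are derived from the baseline; values outside that
    range are clamped into the first or last bin.
    """
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins
    if width == 0:
        width = 1.0  # degenerate baseline: everything falls in one bin

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # Floor at eps so empty bins do not produce log(0).
        return [max(c / len(values), eps) for c in counts]

    e = proportions(expected)
    a = proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


# Hypothetical telemetry: a baseline window and a shifted recent window.
baseline = [i / 100 for i in range(100)]
recent = [0.5 + i / 200 for i in range(100)]

psi_same = population_stability_index(baseline, baseline)
psi_shifted = population_stability_index(baseline, recent)
print(f"PSI (no change): {psi_same:.4f}")
print(f"PSI (shifted):   {psi_shifted:.4f}")
```

A PSI near zero indicates the recent distribution matches the baseline, and larger values indicate more drift. The level that should trigger a drift finding or a re-attestation cycle is a policy decision that this sketch does not encode.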

Activities

  • Gate Review execution
  • Audit center management
  • Re-attestation triggers and cycles
  • Risk acceptance reviews
  • Governance scorecard assessment
  • Model retirement evaluation
  • Stakeholder validation reviews
  • Benchmarking against success criteria
  • Bias and fairness testing
  • Business value validation
  • Regulatory conformity assessment
  • Internal audit execution
  • Audit preparation and support
  • Agent performance monitoring and trust score tracking
  • Agent behavior drift detection and compliance assessment
  • Vendor performance monitoring and supply chain audit
  • AI-BOM review and supplier maturity assessment

Outputs & Deliverables

  • Gate review decisions and action items
  • Audit findings and remediation plans
  • Re-attestation records
  • Risk acceptance register
  • Transformation effectiveness scorecard
  • Bias and Fairness Testing Report
  • Business Value Validation Report
  • Conformity Assessment Record
  • COMPEL Governance Scorecard
  • Stakeholder Approval Register
  • Agent trust score report and incident review
  • Vendor performance report and supplier maturity scorecard
  • Supply chain audit report with AI-BOM verification

Key Questions

  • Are we meeting our transformation objectives?
  • What audit findings need remediation?
  • Are stage transitions proceeding as planned?
  • What is our current governance maturity score versus the baseline?

Gate / Exit Criteria

  • KPI review completed for all deployed systems
  • Control performance audit completed with findings documented
  • Adoption review shows trends against targets
  • Incident and risk review completed with categorized findings
  • ROI calculation completed for each active use case
  • Gate E review passed with decision recorded
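
The exit criteria above can be treated as a checklist that is evaluated before the Gate E decision is recorded. The sketch below is one possible way to do that, assuming each criterion must be both met and backed by an evidence reference. The `Criterion` class, the `gate_e_decision` function, and the evidence identifiers are hypothetical and are not part of the framework.

```python
from dataclasses import dataclass


@dataclass
class Criterion:
    name: str
    met: bool
    evidence: str = ""  # hypothetical reference into the evidence repository


def gate_e_decision(criteria):
    """Summarize which criteria block the gate and whether it passes."""
    unmet = [c.name for c in criteria if not c.met]
    missing_evidence = [c.name for c in criteria if c.met and not c.evidence]
    return {
        "passed": not unmet and not missing_evidence,
        "unmet": unmet,
        "missing_evidence": missing_evidence,
    }


# The first five exit criteria; the sixth is the gate decision itself.
criteria = [
    Criterion("KPI review completed for all deployed systems", True, "EV-001"),
    Criterion("Control performance audit completed with findings documented", True, "EV-002"),
    Criterion("Adoption review shows trends against targets", True, "EV-003"),
    Criterion("Incident and risk review completed with categorized findings", True, "EV-004"),
    Criterion("ROI calculation completed for each active use case", False),
]

result = gate_e_decision(criteria)
print("Gate E passed" if result["passed"] else "Gate E blocked")
for name in result["unmet"]:
    print(f"  unmet: {name}")
for name in result["missing_evidence"]:
    print(f"  missing evidence: {name}")
```

Separating unmet criteria from criteria that lack evidence keeps the recorded decision traceable: a reviewer can see whether the gate is blocked by outstanding work or by outstanding documentation.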


Cross-Cutting Concerns