Skip to main content
AITE M1.4-Art16 v1.0 Reviewed 2026-04-06 Open Access
M1.4 AI Technology Foundations for Transformation
AITF · Foundations

Compliance-Grade Literacy Evidence

Compliance-Grade Literacy Evidence — Technology Architecture & Infrastructure — Advanced depth — COMPEL Body of Knowledge.

14 min read Article 16 of 48

COMPEL Specialization — AITE-WCT: AI Workforce Transformation Expert Article 16 of 35


Evidence is the part of AI literacy where HR teams learn that the work is not a training programme, it is a records programme. EU AI Act Article 4 places a legal obligation on providers and deployers of AI systems to take measures to ensure a sufficient level of AI literacy among their staff, taking into account their technical knowledge, experience, education, and training, together with the context in which AI systems will be used. ISO/IEC 42001:2023 Clause 7.2 requires the organisation to determine competence for persons doing work that affects the AI management system, and Clause 7.3 requires awareness of the AI policy across the relevant workforce. NIST AI RMF GOVERN 2.2 asks for training records adequate to an organisation’s stated risk tolerance. None of these instruments cares whether the completion dashboard looks attractive. They care whether, in the event of an incident or an audit, the organisation can produce a specific, timestamped, role-mapped record that shows who learned what, when, at what depth, and with what result.

The expert who designs a literacy programme without designing the evidence architecture in parallel has designed a programme that cannot be defended. This article teaches the architecture. It is deliberately placed at the close of Unit 3 because every earlier design decision — taxonomy (Article 12), curriculum (Article 13), delivery (Article 14), measurement (Article 15) — feeds this one. Evidence is not a reporting step at the end. It is a property of the programme that must be designed in from the first curriculum decision.

The five evidentiary asks

Regulators, auditors, and works councils ask different questions, but the underlying asks resolve to five artefact classes. Design the architecture to produce all five by default, not on request.

  1. Population map. Which roles in the organisation sit on which literacy levels? An auditor asks who was required to learn this and why. The answer is a role-to-level map, dated, with the role-inventory source of record and the approving function (typically Head of People × Head of AI Governance, jointly).
  2. Completion records. For each named individual, which literacy modules were completed, at what score, on what date, and against which version of the curriculum? A regulator asks did this person, whose role exposed them to this AI system, have the training the law required.
  3. Assessment integrity. How was the completion assessed? A rubber-stamp attestation is not training. A regulator asks what did the person have to do to pass; an auditor asks is the assessment valid and reliable. Evidence here is the assessment blueprint, the pass-fail distribution, and the item-analysis that shows the assessment discriminates.
  4. Behavioural follow-through. Did the training produce behavioural change in the workflow? A works council asks did anyone actually change how they work, or did they just click through a module. Evidence here is the adoption-signal dashboard from Article 15, wired to the training record so that an auditor can ask for a specific cohort’s adoption trajectory and receive it.
  5. Re-certification cadence and governance. How is the literacy kept current? Regulators ask when does this expire, and what triggers a refresh. Evidence is a written re-certification policy, the cadence per level (annual for AI-worker and AI-specialist is the norm; 24–36 months for AI-user; every significant system change for all), the governance body that approves exceptions, and an expiry dashboard that flags upcoming due dates.

An evidence pack is only as strong as its weakest of the five. A programme that records completion but cannot show assessment integrity fails the same audit as a programme that never recorded completion at all.

System-of-record architecture

Choose a single system of record for literacy evidence and make every other system a mirror. The common pattern is to treat the HRIS — Workday, SAP SuccessFactors, Oracle HCM, UKG, ADP, BambooHR, or open-source OrangeHRM — as the record of who the person is and what their role is, and to treat the LMS or LXP — Docebo, Cornerstone, Workday Learning, SAP SuccessFactors Learning, Degreed, EdCast, LinkedIn Learning, Coursera for Business, Udacity, Open edX, Moodle — as the record of what the person learned and when. A regulator asking for an evidence pack needs a query that joins the two. If the join requires manual reconciliation, the architecture has failed.

The joining key is the employee ID. The joining logic must reconcile across lifecycle events: a learner who transfers between legal entities, who changes name, who takes statutory leave, who is a contingent worker rather than an employee, who is a contractor accessing the LMS through a partner account. Every one of these lifecycle events breaks naïve joins. The expert’s responsibility is to list the lifecycle events in advance and test the join against each.

The evidence schema that travels well across regulators has seven fields per record: learner_id, role_code_at_completion, literacy_level_required, module_id, module_version, completion_date, assessment_score_and_outcome. Every other field (time-in-module, click-stream, browser-fingerprint) is optional and subordinate to these seven. A mature programme also records role_code_history so that a learner’s role at time of training can be reconstructed even after a subsequent transfer.

Role-to-level mapping — the governance artefact regulators ask for first

The role-to-level map is the single most-requested artefact in the three regulator-facing evidence packs we have reviewed (EU AI Office preparatory guidance, the ICO’s AI auditing framework consultation responses, the French CNIL’s enforcement correspondence). It is asked for first because it establishes the universe of obligation. Without it, no other evidence is interpretable.

The map is a table: every role in the role inventory (source of record: HRIS), cross-listed against the literacy level required (source of record: an authority file maintained jointly by Head of People and Head of AI Governance). For each cell, the map records the rationale in one sentence — typically a reference to the AI touchpoints of the role and the EU AI Act risk classification of those touchpoints. The map is version-controlled; the current version is the basis for training obligation; prior versions are retained for retrospective audit.

Three common design failures appear in this artefact:

  • Role-inventory drift. The HRIS role catalogue and the map diverge. New roles are created in the HRIS without being added to the map. Remediation: a monthly reconciliation job that lists roles present in HRIS but missing from the map, and blocks HRIS role-creation without a corresponding map entry.
  • Over-broad level assignment. Every role is assigned AI-user level “to be safe”. The result is a literacy programme the organisation cannot resource, completion rates that collapse, and regulators who ask why so many exceptions were granted. Remediation: a calibration exercise that tests the assignment against actual AI touchpoints per role, documented in the map rationale.
  • Under-inclusion of contingent workers. Contractors, agency staff, and outsourced operators touch AI systems but are omitted from the HRIS and therefore from the map. Remediation: a contingent-worker registry that participates in the map; contractual flow-down of literacy requirements to the staffing partner.

Compliance-grade assessment — what makes a score defensible

A completion record without a defensible assessment is a completion certificate for attendance. The evidence architecture must capture four properties of each assessment: the blueprint (which competencies it tests and at what cognitive depth), the item pool (with item-analysis showing difficulty and discrimination), the cutscore and how it was set, and the invigilation regime (for higher-stakes levels). An assessment passed by 98% of learners on first attempt is almost certainly not discriminating between literacy and non-literacy; it is either too easy or is being gamed. An assessment passed by 40% is almost certainly not the learning gate but the sorting gate, and is a fairness problem as well as an evidence problem.

The defensible range for AI-worker and AI-specialist assessments is a first-attempt pass rate of 60–80%, with a cutscore set through one of the standard methods (Angoff, modified Angoff, bookmark, or Ebel). The expert documents the method and the panel composition; the documentation is part of the evidence pack. For AI-user level, a higher first-attempt pass rate is acceptable (80–92%) because the assessment is targeted at awareness and basic knowledge rather than applied judgment. For general-population level, a mastery-learning design (unlimited attempts with remediation) is defensible as long as the records show the attempts and remediations.

Re-certification — the evidence that does not rot

Most literacy records go stale within 12 months. Models update, internal policies change, new regulatory guidance lands, new systems enter production. A programme that treats literacy as a one-time completion will, within a regulatory cycle, find itself defending records that were accurate at the time but are no longer. The re-certification policy is the antidote. It specifies, per level, the standing re-certification cadence (typical: annual for AI-specialist, annual or 18-month for AI-worker, 24–36 months for AI-user), the event-triggered refresh requirement (a material system change, a material regulatory change, an incident involving the learner’s domain), and the exception governance (who can waive, for how long, on what documented basis).

The expiry dashboard is the operational artefact. It is a report — available to the Head of People, Head of AI Governance, and each Head of Function — that shows, for each population segment, the fraction of staff current on their level, the fraction nearing expiry, and the fraction overdue. The dashboard is the early-warning system. A regulator asking for expiry status will expect to receive a dated extract, not an ad-hoc query.

Works-council scrutiny — a different audience than the regulator

Works councils (Betriebsrat in Germany, comité social et économique in France, ondernemingsraad in the Netherlands) care about evidence, but they care about it for a different reason than the regulator. The regulator cares whether the obligation was met. The works council cares whether the training was proportionate, fair, and imposed no hidden monitoring burden. The expert’s evidence architecture must therefore also produce the proportionality record (total learner hours per role, with comparisons across role families), the fairness record (completion and pass rates by demographic segment, to the extent permitted by local law), and the privacy-impact record (what personal data the LMS collects, how long it is retained, who has access, and whether any inference is drawn about performance from it).

Under the EU works-council directive and national implementations, training programmes above a threshold are consultative matters. An evidence pack delivered after the fact, to a council that was not consulted before, is a poor foundation for the next cycle of engagement. Article 27 of this credential covers the consultation process; the point to carry away here is that the evidence architecture must be legible to the council, not only to the auditor.

Two real-world anchors

The EU AI Act Article 4 obligation and its first year in force

Article 4 of the EU AI Act (Regulation (EU) 2024/1689, in force from 2 August 2025) is the first explicit legal literacy obligation on AI operators in a major jurisdiction. The text places the duty on providers and deployers of AI systems and requires them to take measures to ensure a sufficient level of AI literacy of their staff and other persons dealing with the operation and use of AI systems on their behalf. The Commission’s AI Office has begun publishing preparatory Q&A material, and national supervisory authorities are signalling that they will ask for literacy evidence in enforcement correspondence. Source: https://eur-lex.europa.eu/eli/reg/2024/1689/oj.

The organisations that were early-ready in the opening twelve months shared three properties: a role-to-level map in place before the Act’s entry into force, a literacy curriculum version-locked to a named date (so that a pre-Act completion could be distinguished from a post-Act completion), and a re-certification schedule aligned to the organisation’s AI-system change cadence rather than the calendar year. Organisations that treated Article 4 as an awareness-broadcast problem found themselves unable, twelve months on, to answer the basic evidentiary questions.

ISO/IEC 42001:2023 Clauses 7.2 and 7.3 as audit reference

ISO/IEC 42001:2023 is the international management-system standard for artificial intelligence, published in December 2023 and now adopted by a growing number of organisations seeking independent certification. Clause 7.2 requires the organisation to determine the competence of persons doing work under its control that affects the performance of the AIMS, to ensure they are competent on the basis of appropriate education, training or experience, and to retain appropriate documented information as evidence of competence. Clause 7.3 requires that persons doing work under the organisation’s control are aware of the AI policy and their contribution to the effectiveness of the AIMS. Source: https://www.iso.org/standard/81230.html.

The certification body, during an ISO/IEC 42001 audit, will typically sample literacy records across roles and trace them from the policy obligation through the role-to-level map through the LMS record to the assessment blueprint. The audit is tractable only if the architecture supports trace-through. Organisations that maintain separate spreadsheets for each artefact find themselves spending the audit reconstructing evidence that should have been produced by query.

Putting the architecture on a single page

The expert’s single-page architecture diagram (suggested as a BridgeDiagram in the visualisation set: regulatory clause on the left, evidence artefact on the right, system of record in the middle) is a useful one-pager for board and works-council conversations. Every clause in Article 4, ISO 42001 Clauses 7.2/7.3, NIST AI RMF GOVERN 2.2, and any sector-specific requirement (EBA AI guidance, FCA consultation, US Equal Employment Opportunity Commission guidance on AI in employment) maps to an artefact. Each artefact maps to a system. A missing cell in the table is a gap that will be found by an auditor.

A second diagram (HubSpokeDiagram with the evidence pack at hub and the five artefact classes as spokes) is the executive summary. The hub-spoke is what goes into the board paper; the bridge diagram is what goes into the audit file.

Learning outcomes — confirm

A learner completing this article should be able to:

  • Name the five artefact classes that compose a compliance-grade literacy evidence pack.
  • Describe the role-to-level map as a governance artefact and the three common failure modes.
  • Specify the seven-field evidence schema and the lifecycle events the join must survive.
  • Set a defensible first-attempt pass-rate range for each level and name the cutscore-setting methods.
  • Explain what a works council is asking for and why it differs from the regulator ask.
  • Map an organisation’s literacy obligations under EU AI Act Article 4 and ISO/IEC 42001 Clauses 7.2 and 7.3 to concrete artefacts.

Cross-references

  • EATF-Level-1/M1.6-Art02-AI-Literacy-Strategy-and-Program-Design.md — Core Stream literacy anchor.
  • EATF-Level-1/M1.6-Art01-The-Human-Dimension-of-AI-Transformation.md — root article for workforce transformation.
  • Article 12 of this credential — literacy taxonomy (supplies the levels this article evidences).
  • Article 15 of this credential — measurement (supplies the behavioural-signal dashboard this article cites).
  • Article 17 of this credential — sustainability (operationalises the re-certification cadence).
  • Article 27 of this credential — works-council engagement (extends the council-facing evidence ask).

Diagrams

  • BridgeDiagram — regulatory clause → evidence artefact → system of record, covering EU AI Act Article 4, ISO/IEC 42001 Clauses 7.2/7.3, NIST AI RMF GOVERN 2.2.
  • HubSpokeDiagram — evidence pack at hub, five artefact classes as spokes: population map, completion records, assessment integrity, behavioural follow-through, re-certification governance.

Quality rubric — self-assessment

DimensionSelf-score (of 10)
Technical accuracy (every clause and cadence claim traceable)9
Technology neutrality (multiple LMS, HRIS, and LXP vendors named with parity)10
Real-world examples ≥2, public sources10
AI-fingerprint patterns (em-dash density, banned phrases, heading cadence)9
Cross-reference fidelity (Core Stream anchors verified)10
Word count (target 2,500 ± 10%)10
Weighted total91 / 100