AITE M1.4-Art04 v1.0 Reviewed 2026-04-06 Open Access
M1.4 AI Technology Foundations for Transformation
AITF · Foundations

Role Exposure Scoring

Role Exposure Scoring — Technology Architecture & Infrastructure — Advanced depth — COMPEL Body of Knowledge.

13 min read Article 4 of 48

COMPEL Specialization — AITE-WCT: AI Workforce Transformation Expert Article 4 of 35


A workforce strategy team publishes a heat-map of AI exposure across the organisation’s 46 role families. Roles with an exposure score above 0.7 are shaded red; the shading generates immediate anxiety in the affected populations and immediate enthusiasm in the cost-reduction-focused members of the executive committee. A week later a frontline manager asks the team a simple question: what does a score of 0.71 mean that a score of 0.69 does not? The team cannot answer. The methodology was sound at the task level and attractive at the aggregation level, but the cut-off at 0.7 was a visual choice, not an evidentiary one. Role exposure scoring is the most misused quantitative tool in workforce transformation because it produces a number that looks like a verdict when it is actually an input. This article teaches the expert practitioner to produce exposure scores with defensible methodology, to aggregate them without false precision, to identify the methodological limits, and to communicate scores so that they inform rather than decide.

Why scoring exists at all

Exposure scoring exists because role-level intuition about AI impact is unreliable. Practitioners who rely on informed guesswork systematically over-weight visible roles and under-weight roles that are high-exposure but low-visibility to the executive team. The WEF Future of Jobs Report 2025 documents how organisations consistently mis-estimate which roles are most affected when they do not conduct task-level analysis.[1] The OECD’s AI, Data, and the Future of Skills working papers provide comparable evidence at the national-economy level.[2] Scoring does not replace judgment; it disciplines it.

The discipline is methodological. Two methodologies dominate the published literature. The first is the Eloundou et al. 2023 approach, in which tasks are scored for LLM exposure using human raters and GPT-4 as a rater, producing task-level exposure estimates that can be aggregated to occupations.[3] The second is the ILO approach from Gmyrek, Berg, and Bescond’s 2023 working paper (and 2024 update), which uses a slightly different exposure rubric oriented to global labour markets and includes a quality dimension alongside quantity.[4] Both methodologies trace to the earlier Frey and Osborne 2013 automation-exposure work, which remains a useful reference for the limits of occupation-level rather than task-level analysis.[5] Expert practice uses Eloundou or ILO as the backbone and adapts the rubric to the organisation’s specific work patterns.

The five-step scoring method

A defensible scoring process follows five steps. Each step has methodological choices the practitioner must make explicitly.

Step one — role decomposition. The role is decomposed into constituent tasks. The O*NET task dictionary is the most commonly used starting point in the United States; ESCO provides a comparable European taxonomy.[6][7] A knowledge-worker role typically decomposes into 20 to 50 tasks. Decomposition is done with the incumbents in the role, not only from the job description, because job descriptions systematically under-count coordination, judgment, and knowledge-management work.

Step two — task classification. Each task is classified along three dimensions. The first is task type — information-processing, communication, judgment, physical, or supervisory. The second is task context — whether the task is client-facing, internal, regulated, or safety-critical. The third is task importance — what fraction of the role’s contribution comes from this task. Importance matters because a highly-exposed task that is a small fraction of the role produces a very different workforce implication than a highly-exposed task that is most of the role.

Step three — exposure rating. Each task is rated for exposure using an explicit rubric. A serviceable expert rubric uses a six-point scale: 0 (not exposed), 1 (peripherally exposed), 2 (assistance feasible), 3 (co-production feasible), 4 (largely automatable), 5 (fully automatable). Raters work in pairs to reduce individual bias; disagreements are adjudicated by a third rater. This mirrors the Eloundou methodology’s approach to reducing rater noise.[3]
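
As a concrete illustration of the paired-rating mechanics, the sketch below records two raters per task and flags large disagreements for a third rater. It is a minimal sketch, not a COMPEL reference implementation; the 2-point adjudication threshold and all names are assumptions.

```python
from dataclasses import dataclass
from typing import Optional

ADJUDICATION_THRESHOLD = 2  # assumption: a 2-point gap triggers a third rater

@dataclass
class TaskRating:
    task: str
    rater_a: int                        # 0-5 rubric score, first rater
    rater_b: int                        # 0-5 rubric score, second rater
    adjudicated: Optional[int] = None   # third-rater score, where required

    def needs_adjudication(self) -> bool:
        return abs(self.rater_a - self.rater_b) >= ADJUDICATION_THRESHOLD

    def final_score(self) -> float:
        if self.needs_adjudication():
            if self.adjudicated is None:
                raise ValueError(f"{self.task!r} awaits third-rater adjudication")
            return float(self.adjudicated)
        return (self.rater_a + self.rater_b) / 2

ratings = [
    TaskRating("draft discharge summary", rater_a=4, rater_b=4),
    TaskRating("bedside examination", rater_a=0, rater_b=2, adjudicated=1),
]
for r in ratings:
    print(r.task, r.final_score())
```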

Step four — aggregation. Task-level ratings are aggregated into a role-level score, weighted by task importance. A role’s aggregate exposure is the importance-weighted average of its task exposures. The aggregation must not be presented as a single decimal figure. Expert practice reports aggregate exposure as a band — low (weighted average ≤1.5), moderate (1.5–2.5), high (2.5–3.5), very high (>3.5) — with the underlying task distribution visible.
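
A minimal sketch of the step-four arithmetic, assuming task importance is expressed as weights summing to 1 (the band thresholds follow the text above; function and variable names are illustrative):

```python
def weighted_exposure(scores, weights):
    """Importance-weighted average of 0-5 task exposure scores (step four)."""
    if abs(sum(weights) - 1.0) > 1e-6:
        raise ValueError("task importance weights must sum to 1")
    return sum(s * w for s, w in zip(scores, weights))

def exposure_band(avg):
    """Report the aggregate as a band, never as a bare decimal."""
    if avg <= 1.5:
        return "low"
    if avg <= 2.5:
        return "moderate"
    if avg <= 3.5:
        return "high"
    return "very high"

# Invented example: five tasks with importance weights summing to 1.
scores = [4, 4, 3, 1, 0]
weights = [0.3, 0.2, 0.2, 0.2, 0.1]
avg = weighted_exposure(scores, weights)
print(exposure_band(avg))  # "high" -- report the band plus the task distribution
```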

Step five — uncertainty annotation. Every aggregate score carries an explicit uncertainty annotation. Which tasks had high rater disagreement? Which tasks depend on capabilities that are currently speculative? What is the score’s sensitivity to the top three tasks? Without uncertainty annotation the score will be over-interpreted.
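
The sensitivity question in step five can be made concrete: shift the highest-importance tasks by one rubric point in each direction and see whether the aggregate crosses a band boundary. The sketch below is illustrative only; the choice of three tasks and a one-point delta are assumptions.

```python
def sensitivity_to_top_tasks(scores, weights, n=3, delta=1.0):
    """Swing in the aggregate if the n highest-weight tasks move by +/- delta."""
    top = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)[:n]
    shift = [delta if i in top else 0.0 for i in range(len(scores))]
    base = sum(s * w for s, w in zip(scores, weights))
    up   = sum((s + d) * w for s, d, w in zip(scores, shift, weights))
    down = sum((s - d) * w for s, d, w in zip(scores, shift, weights))
    return base, down, up

scores = [4, 4, 3, 1, 0]
weights = [0.3, 0.2, 0.2, 0.2, 0.1]
base, low_swing, high_swing = sensitivity_to_top_tasks(scores, weights)
print(f"aggregate {base:.2f}, swing {low_swing:.2f}-{high_swing:.2f}")
# 2.80, swing 2.10-3.50: the swing crosses the moderate/high band boundary,
# and the uncertainty annotation must say so.
```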

[DIAGRAM: HubSpokeDiagram — role-decomposition-exposure-scoring — central hub “Role” with spokes for each constituent task, each spoke color-coded by exposure band (low-moderate-high-very-high) and annotated with task importance (as spoke thickness). Primitive teaches decomposition-aggregation-uncertainty as a single visual artefact.]

What the score is and is not

The aggregate exposure score is an input to the orientation decision of Article 3 and to the role-redesign decision of Article 25. It is not a verdict. Four rules keep the score honest.

The score never decides redundancy. A high exposure score means that task-level redesign is feasible and likely, not that the role is redundant. Klarna’s 2024 rehiring of human customer-service staff after its aggressive AI automation push illustrates the failure mode when exposure scoring is read as a redundancy determination.[8] IBM’s 2023 public pause on back-office hiring, and the subsequent moderation of the stated rationale, illustrates a related failure mode in public communication.[9]

The score never substitutes for regulatory analysis. A workflow’s regulatory treatment (Article 3 criterion five) is independent of its exposure. High-exposure workflows under GDPR Article 22 or EU AI Act Article 6 high-risk categories still require human decision responsibility.[10][11]

The score never substitutes for the human-value analysis. The tacit knowledge and coordination work employees contribute is systematically under-scored because it is hard to articulate at the task level. Expert practice explicitly asks incumbents “what do you do that is not in the task list” and captures the answer as a qualitative annotation on the aggregate score.

The score has a shelf life. Exposure scores calculated on 2025 model capabilities age as model capabilities evolve. Rescore annually at a minimum; rescore sooner when a material model-capability step-change occurs. Singapore’s SkillsFuture-supported national skills intelligence operates on a rolling basis for exactly this reason.[12]

Avoiding false precision

A recurring error is over-claiming the precision of the aggregate score. A score of 2.74 is not meaningfully different from a score of 2.61. Reporting as bands (low/moderate/high/very-high) prevents leaders from acting on noise. Comparison across roles should be done within bands, not across decimal differences.

A second error is conflating task exposure with economic value. An exposed task may be exposed and valuable, exposed and incidental, unexposed and valuable, or unexposed and incidental. The role-redesign work of Articles 24 and 25 uses the task decomposition to reason about value as well as exposure; the exposure score alone is not sufficient.

A third error is treating exposure as a single number per role. Roles with high task variance — where some tasks are highly exposed and others are not — behave very differently from roles with uniform exposure. A physician role has uniformly low exposure on bedside patient examination but high exposure on literature review and discharge-summary drafting; treating the role as a single number would mask the redesign opportunity in the high-exposure tasks. Expert practice reports the task distribution alongside the aggregate.
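
The variance point can be shown in a few lines: two roles with the same mean exposure can have very different task distributions. A minimal, unweighted sketch (names and sample scores are invented):

```python
import statistics

def exposure_profile(scores):
    """Aggregate plus spread: roles with the same mean can differ sharply."""
    return {
        "mean": statistics.mean(scores),
        "stdev": statistics.pstdev(scores),
        "range": (min(scores), max(scores)),
    }

uniform_role = [2, 2, 2, 2]   # uniformly moderate exposure
mixed_role   = [0, 0, 4, 4]   # same mean, very different redesign story
print(exposure_profile(uniform_role))  # stdev 0.0
print(exposure_profile(mixed_role))    # stdev 2.0 -> high task variance
```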

Communicating scores to different audiences

Exposure scores are communicated differently to different audiences.

To the board, exposure is reported at the portfolio level — how many roles sit in each band, what trend is observed against the prior quarter, what material changes have occurred. Scores are accompanied by qualitative commentary. Boards are sophisticated readers of portfolio data but unfamiliar with exposure methodology; the briefing includes a one-page summary of the methodology and its limits.

To managers of affected roles, exposure is reported at the role level with task decomposition visible. Managers are the coaching front line and need to see which tasks in their team’s roles are the focus of redesign. Article 28’s manager enablement curriculum equips them to read the data.

To employees, the most common expert-practitioner mistake is to communicate exposure scores at all. Individual employees do not benefit from a score attached to their role; the score produces anxiety without actionable information. What employees benefit from is honest communication about the orientation decision (Article 3), the role-redesign plan (Article 25), and the support available during transition. The underlying exposure analysis is the reason for the communication, not its content.

To works councils and unions, the exposure scoring is shared as methodological documentation and as input to consultation. Transparency on methodology typically builds trust; attempts to keep the methodology proprietary typically erode it. Germany’s IG Metall and Volkswagen works-council AI negotiations are a documented case in which methodological transparency on AI impact has been a point of productive engagement.[13]

[DIAGRAM: Matrix — exposure-communication-matrix — rows: audience (board, managers, employees, works council/union). Columns: report format, level of aggregation, accompanying qualitative commentary, action enabled. Primitive teaches that communication of the score varies systematically with audience.]

A short applied example

A large healthcare network commissions exposure scoring across 28 role families. The methodology uses O*NET as the starting point, adapted with clinician input. Raters are drawn from operations, clinical leadership, and an external advisory panel. Aggregate scores are reported as bands. The score for the primary-care physician role comes out as moderate with wide task variance — bedside examination tasks are not exposed; literature review, care-plan drafting, and discharge-summary composition are highly exposed. The organisation does not publish the aggregate score externally. It does use the task-level decomposition to design a pilot of AI-assisted discharge-summary drafting in approver configuration (Article 2). The UK NHS AI Lab’s ongoing workforce programmes provide comparable reference case material for healthcare-specific exposure analysis.[14] Workforce sentiment through the pilot is tracked via Glint or Peakon pulse surveys alongside clinical quality and time-saved metrics.

Legal and ethical boundaries

Exposure scoring touches employment-law and data-protection boundaries that the expert practitioner must respect.

Under GDPR Article 22, decisions about employees that are “based solely on automated processing” and produce legal or similarly significant effects are restricted; exposure scoring that feeds directly into individual decisions about specific employees without human review can fall within this restriction.[15] Under the EU AI Act, workforce-management systems used in decisions about employment, career progression, and performance are classified as high-risk under Annex III and subject to the obligations of Articles 9, 10, and 14, including risk management, data governance, and human oversight.[11] Expert practice runs exposure scoring as input to human-led redesign decisions rather than as automated decision-making, and documents the human-in-the-loop governance explicitly.

Under anti-discrimination law in most jurisdictions, exposure scoring that produces disparate impact across protected populations invites legal challenge even without discriminatory intent. The expert practitioner tests the scoring methodology for disparate impact and documents the testing. Union and works-council jurisdictions add further consultation obligations before scoring is used operationally.
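
One common screening heuristic for the disparate-impact testing described above is the US EEOC “four-fifths rule”. The sketch below adapts it to exposure scoring by comparing the share of each population whose roles land in the top exposure bands; it is a screening heuristic run on invented data, not a legal test, and all names are assumptions.

```python
def impact_ratios(high_band_rate_by_group):
    """Four-fifths screen, adapted: placement in a high exposure band is the
    adverse outcome, so the favourable baseline is the group with the LOWEST
    high-band rate; ratios below 0.8 warrant review."""
    baseline = min(high_band_rate_by_group.values())
    return {
        group: (baseline / rate) if rate else float("inf")
        for group, rate in high_band_rate_by_group.items()
    }

# Invented data: share of each population's roles in high/very-high bands.
rates = {"group_a": 0.10, "group_b": 0.22}
for group, ratio in impact_ratios(rates).items():
    print(f"{group}: ratio {ratio:.2f} -> {'REVIEW' if ratio < 0.8 else 'ok'}")
```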

The ethical boundary is the distinction between scoring roles and scoring people. Roles do not have dignity concerns; people do. The scoring methodology applies to roles, and the resulting information informs role-redesign. Individual employees are supported through redesign with the dignity and care that Articles 26 and 27 cover.

Expert habits around scoring

Three expert habits separate sound practice from brittle practice.

The first is publication discipline. The methodology is published internally so anyone affected can scrutinise it. Scores are not leaked informally before the structured communication. The difference between a score appearing in a leaked spreadsheet and the same score appearing in a structured briefing is enormous for organisational trust.

The second is adversarial review. Before scores are used in any decision, a structured adversarial review asks where the scoring is wrong — which tasks might be mis-decomposed, which raters might be miscalibrated, which roles might be systematically misread. The practitioner runs the review themselves or invites a peer to run it. Adversarial review frequently reveals two or three material errors that would otherwise be carried forward.

The third is triangulation with sentiment. Exposure scores triangulate with employee sentiment data gathered through Qualtrics, CultureAmp, Peakon, or Glint. When the exposure score says a role is moderately exposed but the population sentiment reads as crisis, something in the communication or the redesign is miscalibrated. When the sentiment reads as calm but the exposure score is very high, either the analysis is understated or the communication has been insufficient. Triangulation catches both failure modes.
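
A minimal sketch of that two-sided check, assuming the bands defined earlier and a pulse-survey sentiment reading normalised to 0–1 (the thresholds and encodings are invented for illustration):

```python
BAND_ORDER = ["low", "moderate", "high", "very high"]

def triangulate(band: str, sentiment: float) -> str:
    """sentiment: 0.0 (crisis) .. 1.0 (calm), e.g. a normalised pulse score."""
    level = BAND_ORDER.index(band)
    if level <= 1 and sentiment < 0.3:
        return "sentiment worse than exposure: check communication and redesign"
    if level == 3 and sentiment > 0.7:
        return "sentiment calmer than exposure: analysis understated or message not landing"
    return "exposure and sentiment roughly aligned"

print(triangulate("moderate", 0.2))   # first failure mode
print(triangulate("very high", 0.9))  # second failure mode
```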

Summary

Role exposure scoring is a disciplined quantitative input to workforce transformation, not a verdict on which roles survive. Five methodological steps — decompose, classify, rate, aggregate, annotate uncertainty — produce a defensible score. Four rules keep it honest: the score never decides redundancy, never substitutes for regulatory analysis, never substitutes for human-value analysis, and has a shelf life. Communication is calibrated by audience. Three expert habits — publication discipline, adversarial review, sentiment triangulation — separate sound practice from brittle practice. Article 5 takes the task-level analysis one step further into skills-adjacency mapping, the foundation for the redeployment work that Article 9 will extend.


Cross-references to the COMPEL Core Stream:

  • EATF-Level-1/M1.6-Art08-Workforce-Redesign-and-Human-AI-Collaboration.md — workforce redesign foundation
  • EATF-Level-1/M1.6-Art03-Building-the-AI-Talent-Pipeline.md — pipeline context for exposure reading
  • EATE-Level-3/M3.2-Art06-Talent-Strategy-at-Enterprise-Scale.md — enterprise-scale talent strategy using exposure


© FlowRidge.io — COMPEL AI Transformation Methodology. All rights reserved.

Footnotes

  1. World Economic Forum, Future of Jobs Report 2025 (January 2025), https://www.weforum.org/reports/the-future-of-jobs-report-2025/ (accessed 2026-04-19).

  2. OECD, “AI, Data, and the Future of Skills” working-paper series, https://www.oecd.org/employment/future-of-work/ (accessed 2026-04-19).

  3. Eloundou, T., Manning, S., Mishkin, P., Rock, D., “GPTs are GPTs: An Early Look at the Labor Market Impact Potential of Large Language Models”, arXiv 2303.10130 (March 2023), https://arxiv.org/abs/2303.10130 (accessed 2026-04-19).

  4. Gmyrek, P., Berg, J., Bescond, D., “Generative AI and Jobs: A Global Analysis of Potential Effects on Job Quantity and Quality”, ILO Working Paper 96 (August 2023, 2024 update), https://www.ilo.org/publications/generative-ai-and-jobs-global-analysis-potential-effects-job-quantity-and (accessed 2026-04-19).

  5. Frey, C.B., Osborne, M.A., “The Future of Employment: How Susceptible Are Jobs to Computerisation?” (Oxford Martin, 2013; published 2017), https://www.sciencedirect.com/science/article/pii/S0040162516302244 (accessed 2026-04-19).

  6. US Department of Labor, “O*NET Occupational Information Network”, https://www.onetonline.org/ (accessed 2026-04-19).

  7. European Commission, “European Skills, Competences, Qualifications and Occupations (ESCO)”, https://esco.ec.europa.eu/ (accessed 2026-04-19).

  8. Bloomberg, “Klarna Rehires Human Staff After Axing Customer Service Agents for AI” (26 November 2024), https://www.bloomberg.com/news/articles/2024-11-26/klarna-rehires-human-staff-after-axing-cx-agents-for-ai (accessed 2026-04-19).

  9. Bloomberg, “IBM to Pause Hiring for Jobs That AI Could Do” (1 May 2023), https://www.bloomberg.com/news/articles/2023-05-01/ibm-to-pause-hiring-for-back-office-jobs-that-ai-could-kill (accessed 2026-04-19).

  10. Regulation (EU) 2016/679 (“GDPR”), Article 22, https://eur-lex.europa.eu/eli/reg/2016/679/oj (accessed 2026-04-19).

  11. Regulation (EU) 2024/1689 (“EU AI Act”), Article 6 and Annex III, https://eur-lex.europa.eu/eli/reg/2024/1689/oj (accessed 2026-04-19).

  12. SkillsFuture Singapore, “Skills Demand for the Future Economy” (2024), https://www.skillsfuture.gov.sg/ (accessed 2026-04-19).

  13. European Commission, “Industrial Relations Report 2024” — works council coverage of AI adoption, https://op.europa.eu/ (accessed 2026-04-19).

  14. UK NHS AI Lab, https://transform.england.nhs.uk/ai-lab/ (accessed 2026-04-19).

  15. Regulation (EU) 2016/679 (“GDPR”), Article 22 (Automated individual decision-making), https://eur-lex.europa.eu/eli/reg/2016/679/oj (accessed 2026-04-19).