AI Resilience Report: Scoring Methodology
Last updated: April 13, 2026
Overview
The AI Resilience Report scores approximately 1,600 U.S. occupations on AI resilience — the degree to which a career continues to offer sustained economic opportunity, employer demand, and meaningful human contribution as AI transforms work.
Each career receives a score from 0 to 100, built from three questions:
- How much of this job's work still depends on humans? (Meaningful Human Contribution)
- Will employers still be hiring for this role? (Long-term Employer Demand)
- Will this career continue to pay well? (Sustained Economic Opportunity)
We answer these questions by combining multiple independent data sources — our own task-level AI research, AI exposure datasets from Anthropic and Microsoft, economic projections from academic researchers, and employment data from the Bureau of Labor Statistics. We use a multi-source ensemble because research finds that no single AI exposure measure reliably captures labor disruption on its own; combining multiple measures substantially improves predictive accuracy (Frank et al., 2025).
Our model aims to provide a reasonable picture of how and to what degree AI is transforming occupations, based on the latest research and data.
How Your Score Is Calculated
The Formula
Your AI resilience score is a weighted average of three sub-scores: Meaningful Human Contribution (40%), Long-term Employer Demand (30%), and Sustained Economic Opportunity (30%).
If a sub-score is unavailable for a given career, the remaining sub-scores are re-weighted proportionally so they still sum to 100%.
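The blend and the proportional re-weighting can be sketched as follows. The 40/30/30 weights come from the sub-score descriptions below; the function and variable names are illustrative, not CareerVillage's actual implementation.

```python
# Sub-score weights as described in the methodology: MHC 40%, LTE 30%, SEO 30%.
WEIGHTS = {"mhc": 0.40, "lte": 0.30, "seo": 0.30}

def resilience_score(sub_scores):
    """Weighted average of available sub-scores (each 0-100).

    Missing sub-scores (None) are dropped and the remaining weights are
    re-scaled proportionally so they still sum to 100%.
    """
    available = {k: v for k, v in sub_scores.items() if v is not None}
    total_weight = sum(WEIGHTS[k] for k in available)
    return sum(WEIGHTS[k] * v for k, v in available.items()) / total_weight

# All three sub-scores present: plain 40/30/30 blend.
resilience_score({"mhc": 80, "lte": 60, "seo": 50})   # ≈ 65.0
# SEO missing: MHC and LTE re-weight to 4/7 and 3/7.
resilience_score({"mhc": 80, "lte": 60, "seo": None})  # ≈ 71.43
```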
The Three Sub-Scores
Meaningful Human Contribution (MHC) — 40% of your score
How much of this career's work still depends on humans? We measure this using four AI exposure datasets (from Anthropic, Microsoft, Will Robots Take My Job, and our internal model). Each source estimates how much of the job AI can handle; we flip those scores so that higher values mean more human contribution. The four sources are combined as a weighted average, with newer data weighted more heavily (see Data Freshness Weighting below).
Long-term Employer Demand (LTE) — 30% of your score
Will employers keep hiring for this role? We use Bureau of Labor Statistics projections, combining annual job openings (60% weight) with projected employment growth over 2024–2034 (40% weight). Both figures are converted to percentiles.
Sustained Economic Opportunity (SEO) — 30% of your score
Will this career continue to pay well? We combine Althoff & Reichardt's wage bill projections (67% weight) — which estimate how total labor income flowing to each occupation will shift as AI advances — with Manning & Aguirre's adaptive capacity index (33% weight), which measures the ability of workers to navigate displacement.
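The two demand-side blends above can be sketched directly from their stated weights. Inputs are assumed to already be 0–100 percentiles; the function names are illustrative.

```python
def lte_score(openings_pct, growth_pct):
    """Long-term Employer Demand: 60% annual job openings, 40% projected growth."""
    return 0.60 * openings_pct + 0.40 * growth_pct

def seo_score(wage_bill_pct, adaptive_pct):
    """Sustained Economic Opportunity: 67% wage-bill projection, 33% adaptive capacity."""
    return 0.67 * wage_bill_pct + 0.33 * adaptive_pct

lte_score(70, 50)  # ≈ 62.0
seo_score(40, 80)  # ≈ 53.2
```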
Resilience Categories
The final score maps to six categories:
| Score | Category |
|---|---|
| 80–100% | Highly Resilient |
| 65–80% | Resilient |
| 50–65% | Mostly Resilient |
| 35–50% | Somewhat Resilient |
| 22–35% | Not Very Resilient |
| 0–22% | Vulnerable |
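The mapping from score to category can be sketched as a threshold lookup. The report does not state how boundary scores (e.g. exactly 65) are assigned, so placing them in the higher tier here is an assumption.

```python
# Lower bound of each tier, highest first (assumption: boundaries round up).
CATEGORIES = [
    (80, "Highly Resilient"),
    (65, "Resilient"),
    (50, "Mostly Resilient"),
    (35, "Somewhat Resilient"),
    (22, "Not Very Resilient"),
    (0,  "Vulnerable"),
]

def category(score):
    for lower_bound, name in CATEGORIES:
        if score >= lower_bound:
            return name
    raise ValueError(f"score out of range: {score}")

category(72)    # "Resilient"
category(21.9)  # "Vulnerable"
```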
What Data Goes In
Our scoring model draws on seven data sources organized into two groups: four that measure AI exposure and three that measure economic demand.
AI Exposure Sources
These measure how much of an occupation's work AI can perform, each from a different angle:
CareerVillage AI Resilience Model v1.0 (our internal model)
Our proprietary task-level assessment. Rather than relying on broad statistical trends, we research individual tasks within each occupation to determine where AI is automating work versus augmenting human capability. Updated quarterly. Coverage: 100% of careers.
Anthropic Observed Exposure
Occupation-level exposure scores (0–1 scale) derived from real Claude AI usage patterns. Sourced from the Anthropic Economic Index. Coverage: 50.7% of careers (809 of 1,597).
Microsoft AI Applicability
AI usefulness scores based on Bing Copilot usage data from January–September 2024, measuring real-world AI application across roles. Coverage: 83.0% of careers (1,326 of 1,597).
Will Robots Take My Job (WRTMJ)
AI risk scores built from a trained regression model and user polling, inspired by Oxford University research on automation. Coverage: 90.8% of careers (1,450 of 1,597).
Economic Demand Sources
These measure whether an occupation will continue to offer jobs and earnings:
Althoff & Reichardt Wage Bill Projections
Forward-looking estimates of how AI will reshape total labor income (average wage × employment) across occupations, accounting for both generative AI and physical AI (robotics). Sourced from NBER Working Paper #33053. Coverage: 97.9% of careers (1,564 of 1,597).
Manning & Aguirre Adaptive Capacity
An occupation-level adaptive capacity index measuring worker characteristics relevant for navigating job transitions if displaced — including net liquid wealth, skill transferability, geographic density, and age. Coverage: 66.2% of careers (1,058 of 1,597). Sourced from Manning & Aguirre (2026).
Bureau of Labor Statistics Employment Projections
Annual job openings and projected employment growth (2024–2034) from the U.S. Bureau of Labor Statistics. Coverage: 98.4% of careers (1,571 of 1,597, excluding military occupations).
Source Update Cadences
| Source | Update Frequency |
|---|---|
| CareerVillage internal model | Quarterly |
| Will Robots Take My Job | Quarterly |
| Anthropic Observed Exposure | Periodically (last: March 2026) |
| Microsoft AI Applicability | Periodically (last: July 2025) |
| Althoff & Reichardt | Static (published March 2026) |
| Manning & Aguirre | Static (published January 2026) |
| Bureau of Labor Statistics | Annually (last: 2024–2034 projections) |
How We Process the Data
Percentile Normalization
Our sources use different scoring scales, so direct comparison isn't possible. We convert every raw score to a 0–100 percentile rank based on its position across all occupations in our database.
Within the Meaningful Human Contribution sub-score, the four AI exposure sources are "flipped" (1 − score) so higher values represent human contribution rather than AI risk. The demand-side sources (Althoff & Reichardt, Manning & Aguirre, and BLS) are used as-is, since higher values already mean greater economic opportunity and resilience.
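A minimal sketch of the normalize-and-flip step, assuming raw exposure scores on a 0–1 scale. The "share of occupations at or below this value" convention for percentile ranks is an assumption; the report does not specify its tie-handling rule.

```python
def percentile_ranks(raw, flip=False):
    """Convert raw scores to 0-100 percentile ranks across all occupations.

    With flip=True, scores are inverted (1 - score) first, so higher
    output values mean more human contribution rather than more AI risk.
    """
    values = [1.0 - v for v in raw] if flip else list(raw)
    n = len(values)
    # Percentile = share of occupations at or below this value, scaled to 0-100.
    return [100.0 * sum(x <= v for x in values) / n for v in values]

# Highest exposure (0.9) becomes the lowest human-contribution percentile.
percentile_ranks([0.9, 0.5, 0.1], flip=True)  # ≈ [33.3, 66.7, 100.0]
```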
Data Freshness Weighting
AI capabilities evolve rapidly, so we apply a freshness discount to the four AI exposure sources in the Meaningful Human Contribution sub-score. Newer research receives higher weight, and data older than 24 months is excluded entirely.
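The report states the direction of the discount (newer data weighted more heavily, a 24-month cutoff) but its exact weight schedule is not reproduced here. The sketch below uses a hypothetical linear discount purely for illustration.

```python
from datetime import date

def freshness_weight(data_date, today):
    """Hypothetical linear freshness discount for an AI exposure source.

    Per the methodology, data older than 24 months is excluded entirely;
    the linear ramp between 0 and 24 months is an illustrative assumption,
    not the report's published schedule.
    """
    age_months = (today.year - data_date.year) * 12 + (today.month - data_date.month)
    if age_months >= 24:
        return 0.0  # excluded entirely
    return 1.0 - age_months / 24.0

freshness_weight(date(2026, 3, 1), date(2026, 4, 1))  # 1 month old -> ≈ 0.958
freshness_weight(date(2024, 3, 1), date(2026, 4, 1))  # 25 months old -> 0.0
```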
Hierarchical Roll-Up
Our database covers 1,597 career entries organized into four levels using the Standard Occupational Classification (SOC) system: 23 Major Groups, 98 Minor Groups, 460 Broad Occupations, and 1,016 Detailed Occupations.
Not every source covers every occupation at the most granular level. When data is missing for a specific role, we use hierarchical "roll-up" logic:
- Detailed Occupations receive scores directly from available sources when possible.
- Broad Occupations average scores from their child Detailed Occupations.
- Minor Groups average scores from their child Broad Occupations.
- Major Groups average scores from their child Minor Groups.
For BLS data specifically, we use three methods: direct SOC-code matching (1,250 careers), aggregation from child occupations (286 careers), and inheritance from the nearest parent category (35 careers).
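The roll-up logic above can be sketched as a recursive average over the SOC tree: a node keeps its direct score when one exists, and otherwise averages its children's rolled-up scores. The tree representation and names here are illustrative.

```python
def rolled_up_score(node):
    """Return the node's direct score if present; otherwise the mean of its
    children's rolled-up scores; otherwise None (no data anywhere below)."""
    if node.get("score") is not None:
        return node["score"]
    child_scores = [s for c in node.get("children", [])
                    if (s := rolled_up_score(c)) is not None]
    return sum(child_scores) / len(child_scores) if child_scores else None

# A Broad Occupation with three Detailed children, one of them unscored.
broad = {"children": [{"score": 60.0}, {"score": 80.0}, {"score": None}]}
rolled_up_score(broad)  # (60 + 80) / 2 = 70.0
```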
Confidence Indicator
Every score includes a confidence rating so you know how much data supports it. Here's how to interpret it:
| Confidence Level | What It Means |
|---|---|
| High (80–100) | Strong data coverage and agreement across sources — the score is well-supported. |
| Medium-high (60–79) | Good coverage with moderate agreement — reliable guidance. |
| Medium (40–59) | Fewer sources or moderate disagreement in some areas — useful but treat with some caution. |
| Low-medium (20–39) | Limited data or notable disagreement — consider the score directional. |
| Low (0–19) | Very limited data — interpret with caution. |
We evaluate confidence separately for two data dimensions, then average them:
The AI Exposure bucket draws on up to 4 sources (Anthropic, Microsoft, WRTMJ, and our internal model). The Demand bucket draws on up to 3 sources (Althoff & Reichardt, Manning & Aguirre, and BLS).
Each bucket scores 0–100 based on two factors:
Source count (0–50 points): More sources means a more robust assessment.
| AI Exposure Bucket (4 sources max) | Points |
|---|---|
| 4 sources (CV + Anthropic + Microsoft + WRTMJ) | 50 pts |
| 3 sources | 35 pts |
| 2 sources | 15 pts |
| 1 source | 0 pts |
| Demand Bucket (3 sources max) | Points |
|---|---|
| 3 sources (Althoff + Manning + BLS) | 50 pts |
| 2 sources | 25 pts |
| 1 source | 0 pts |
Source agreement (0–50 points): When sources within a bucket produce similar scores, we have higher confidence. We measure this using the standard deviation of percentile scores, with thresholds calibrated from quartile analysis of 1,500+ careers.
| AI Exposure SD | Agreement | Points |
|---|---|---|
| ≤ 0.119 | High | 50 pts |
| ≤ 0.178 | Moderate | 25 pts |
| ≤ 0.249 | Some disagreement | 10 pts |
| > 0.249 | High disagreement | 0 pts |
| Demand SD | Agreement | Points |
|---|---|---|
| ≤ 0.087 | High | 50 pts |
| ≤ 0.145 | Moderate | 25 pts |
| ≤ 0.223 | Some disagreement | 10 pts |
| > 0.223 | High disagreement | 0 pts |
Coverage cap: Within each bucket, careers with half or fewer of available sources are capped at Medium confidence (59 max). AI Exposure is capped if ≤ 2 of 4 sources are available; Demand is capped if ≤ 1 of 3 sources is available.
If a career has zero sources in one bucket, that bucket scores 0 and the average still includes it — conservatively penalizing incomplete coverage.
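Putting the point tables, the coverage cap, and the bucket average together, the confidence calculation can be sketched as below. Thresholds and point values come from the tables above; function names are illustrative.

```python
def exposure_count_points(n):
    # 4 sources -> 50, 3 -> 35, 2 -> 15, 1 -> 0 (and 0 sources contribute nothing).
    return {4: 50, 3: 35, 2: 15}.get(n, 0)

def demand_count_points(n):
    # 3 sources -> 50, 2 -> 25, 1 -> 0.
    return {3: 50, 2: 25}.get(n, 0)

def agreement_points(sd, thresholds):
    """Map a bucket's standard deviation to agreement points via its thresholds."""
    high, moderate, some = thresholds
    if sd <= high:
        return 50
    if sd <= moderate:
        return 25
    if sd <= some:
        return 10
    return 0

def confidence(exp_n, exp_sd, dem_n, dem_sd):
    exposure = exposure_count_points(exp_n) + agreement_points(exp_sd, (0.119, 0.178, 0.249))
    demand = demand_count_points(dem_n) + agreement_points(dem_sd, (0.087, 0.145, 0.223))
    # Coverage cap: half or fewer of a bucket's sources caps it at Medium (59).
    if exp_n <= 2:
        exposure = min(exposure, 59)
    if dem_n <= 1:
        demand = min(demand, 59)
    return (exposure + demand) / 2

confidence(exp_n=4, exp_sd=0.10, dem_n=3, dem_sd=0.08)  # (100 + 100) / 2 = 100.0
confidence(exp_n=2, exp_sd=0.10, dem_n=3, dem_sd=0.20)  # (min(65, 59) + 60) / 2 = 59.5
```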
About Our Internal Model
Our proprietary AI Resilience Model v1.0 is designed to provide a high-fidelity, task-level assessment of AI's impact on each occupation. Here's how it works.
The Six-Step Process
We run this process quarterly to ensure findings reflect the latest AI advancements:
- Task Scoring: All 20,000+ O*NET career tasks are scored for automation likelihood using an LLM.
- Task Selection: For each career, we select up to six core tasks, prioritizing those at the extremes of the automation spectrum — examples of both high automation potential and high human-centric stability.
- Deep Research: We use OpenAI's deep_research API to search peer-reviewed journals, industry publications, and research institutions for documented evidence of AI impact on those specific tasks.
- Dual Scoring: Research findings are synthesized through dual LLM prompts to generate independent scores for automation and augmentation.
- Dynamic Weighting: We combine scores using dynamic weighting, where the influence of augmentation is reduced in roles with extremely high automation potential.
- Percentile Mapping: Raw scores are mapped to percentiles (0–100) for consistent comparison with external datasets.
Our approach builds on established research in labor economics and AI impact assessment:
- Task-level analysis rather than treating jobs as monolithic units, following OECD research on skill-based technological change.
- LLM-based scoring, validated by Eloundou et al. (2023), who demonstrated close alignment between GPT-4 predictions and expert surveys on task automation potential.
- Automation and augmentation duality, assessing both AI replacement and AI enhancement, grounded in the Stanford Digital Economy Lab's work on AI's labor market effects.
- Dynamic weighting reflecting Acemoglu & Restrepo's insight that the net impact of automation depends on the balance between displacement and productivity effects.
- Bottleneck analysis examining both the most and least automatable tasks, reflecting the bottleneck concept from Frey & Osborne's Oxford study.
- AI adoption factors from McKinsey Global Institute and the Stockholm School of Economics.
Known limitations of the model:
- Selective task assessment: We focus on extremes (most/least automatable), leaving some intermediate tasks unassessed.
- Equal task weighting: All core tasks are treated with equal importance, though some may occupy more of a worker's time or generate more economic value.
- Research constraints: Findings are bounded by publicly documented, web-searchable information; undocumented workplace shifts may not be captured.
- Non-deterministic scoring: LLM-based scoring introduces minor variability between research cycles. In repeated testing (50 careers scored 5 times each), average standard deviation remains small: 1.32% for automation, 1.53% for augmentation, and 1.23% for weighted results.
We're committed to openness about our methodology. If you'd like access to our prompts, raw data, or additional details, contact us at air@careervillage.org.
Limitations
While our methodology is designed to provide reliable, research-backed insights, we maintain transparency about its structural constraints:
Relative, not absolute measurement. Percentile normalization measures how resilient one career is compared to others, not how much AI is affecting it in absolute terms. By design, the same share of careers will always appear in each tier regardless of whether AI is transforming a lot or a little of the labor market. This is useful for ranking careers against each other, but cannot capture economy-wide shifts over time.
Source correlation. Our ensemble approach assumes independent signals, but the four AI exposure models likely share underlying assumptions about what makes work automatable (e.g., cognitive vs. manual, routine vs. non-routine). The three demand-side sources also share labor market fundamentals. This may reduce the diversification benefit.
Temporal misalignment. Sources operate on different update cadences and reflect different moments in time. Freshness weighting adjusts for this in AI exposure data, but demand-side sources are inherently forward-looking and updated less frequently. We are blending snapshots from different points during a period of rapid technological change.
U.S.-specific framing. Our methodology relies on SOC codes and BLS projections, limiting it to the U.S. labor market. AI's impact likely varies across economies with different labor structures, automation adoption rates, and regulatory environments.
Within-occupation variance. SOC-level analysis provides occupational averages but cannot capture how AI exposure varies by industry, employer type, or geography. A financial analyst at a fintech startup faces different AI dynamics than one at a regional credit union; a truck driver on rural routes encounters different automation timelines than one in dense urban deliveries.
Occupational mapping error. Our methodology depends on mapping between SOC codes, O*NET task libraries, and third-party occupation definitions. Emerging, hybrid, or highly specialized roles don't always align cleanly, and mapping ambiguity can propagate into scores.
References
- Acemoglu, D., & Restrepo, P. (2019). Automation and New Tasks: How Technology Displaces and Reinstates Labor. Journal of Economic Perspectives, 33(2), 3-30. aeaweb.org
- Althoff, L., & Reichardt, H. (2025). Task-Specific Technical Change and Comparative Advantage. NBER Working Paper No. 33053. hugoreichardt.com
- Anthropic. (2026). Anthropic Economic Index: Observed Exposure Dataset. huggingface.co
- Brynjolfsson, E., Li, D., & Raymond, L. (2022). Augmentation, Tasks, and Wages: How AI and Advances in Technology Affect the Labor Market. Stanford Digital Economy Lab. arxiv.org
- Bureau of Labor Statistics. (2024). Employment Projections: 2024-2034 Occupational Outlook. U.S. Department of Labor. bls.gov
- Eloundou, T., Manning, S., Mishkin, P., & Rock, D. (2023). GPTs are GPTs: An Early Look at the Labor Market Impact Potential of Large Language Models. arXiv preprint. arxiv.org
- Frank, M. R., Ahn, Y.-Y., & Moro, E. (2025). AI exposure predicts unemployment risk: A new approach to technology-driven job loss. PNAS Nexus, 4(4), pgaf107. academic.oup.com
- Frey, C. B., & Osborne, M. A. (2017). The Future of Employment: How Susceptible Are Jobs to Computerisation? Technological Forecasting and Social Change, 114, 254-280. oxfordmartin.ox.ac.uk
- Manning, S., & Aguirre, T. (2026). How Adaptable Are American Workers to AI-Induced Job Displacement? NBER. nber.org
- McKinsey Global Institute. (2017). A Future That Works: Automation, Employment, and Productivity. mckinsey.com
- Nedelkoska, L., & Quintini, G. (2018). Automation, Skills Use and Training. OECD Social, Employment and Migration Working Papers, No. 202. oecd-ilibrary.org
- Teigland, R., van der Zande, J., Teigland, K., & Siri, S. (2018). The Substitution of Labor: From Technological Feasibility to Other Factors Influencing Job Automation. Stockholm School of Economics Institute for Research. ssrn.com
- Tomlinson, K., Jaffe, S., Wang, W., Counts, S., & Suri, S. (2025). Working with AI: Measuring the Applicability of Generative AI to Occupations. arxiv.org
- Will Robots Take My Job. (2026). Occupational Automation Risk Analysis and User Sentiment Index. willrobotstakemyjob.com
Are we missing a key data source? Email us at air@careervillage.org.
