
AI and Jobs: A Review of Theory, Estimates, and Evidence

del Rio-Chanona, Ernst, Merola, Samaan, Teutloff

2025 · arXiv pre-print
Review / survey / meta-analysis · Interdisciplinary · Theoretical model
AI (General) · LLM / Generative AI · Machine Learning (pre-LLM) · Junior / entry-level · Human-AI collaboration · Augmentation vs. substitution · Routine task change · General automation · Platforms / gig economy · Software / coding · Writing / content · Customer service
Abstract

Generative AI is altering work processes, task composition, and organizational design, yet its effects on employment and the macroeconomy remain unresolved. In this review, we synthesize theory and empirical evidence at three levels. First, we trace the evolution from aggregate production frameworks to task- and expertise-based models. Second, we quantitatively review and compare (ex-ante) AI exposure measures of occupations from multiple studies and find convergence towards high-wage jobs. Third, we assemble ex-post evidence of AI's impact on employment from randomized controlled trials (RCTs), field experiments, and digital trace data (e.g., online labor platforms, software repositories), complemented by partial coverage of surveys. Across the reviewed studies, productivity gains are sizable but context-dependent: on the order of 20 to 60 percent in controlled RCTs, and 15 to 30 percent in field experiments. Novice workers tend to benefit more from LLMs in simple tasks. Across complex tasks, evidence is mixed on whether low- or high-skilled workers benefit more. Digital trace data show substitution between humans and machines in writing and translation alongside rising demand for AI, with mild evidence of declining demand for novice workers. A more substantial decrease in demand for novice jobs across AI-complementary work emerges from recent studies using surveys, platform payment records, or administrative data. Research gaps include the focus on simple tasks in experiments, the limited diversity of LLMs studied, and technology-centric AI exposure measures that overlook adoption dynamics and whether exposure translates into substitution, productivity gains, or the erosion or growth of expertise.
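As a sketch of the task-based framework the abstract references (the Acemoglu-Restrepo model; the notation below follows the standard presentation of that model, not necessarily the review's own):

```latex
% Output aggregates a continuum of tasks i \in [N-1, N] (CES aggregator):
Y = \left( \int_{N-1}^{N} y(i)^{\frac{\sigma-1}{\sigma}} \, di \right)^{\frac{\sigma}{\sigma-1}}

% Tasks below an automation threshold I are produced by capital (AI),
% the rest by labor:
y(i) =
\begin{cases}
A_K \, \gamma_K(i) \, k(i) & \text{if } i \le I \quad \text{(automated)} \\
A_L \, \gamma_L(i) \, l(i) & \text{if } i > I \quad \text{(labor)}
\end{cases}
```

In this framing, automation (raising the threshold $I$) substitutes capital for labor at the margin, while the creation of new labor-intensive tasks (raising $N$) reinstates labor demand; this tension is what distinguishes task-based models from the aggregate production frameworks the review starts from.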

Summary

del Rio-Chanona, Ernst, Merola, Samaan, and Teutloff conduct a comprehensive literature review: they synthesize theoretical frameworks (growth theory, task-based models, collective intelligence), quantitatively compare AI exposure measures across multiple studies, and systematically review ex-post evidence from experiments and observational data to understand how generative AI affects employment, productivity, and inequality across occupations and skill levels.

Main Finding

Productivity gains from generative AI are substantial but context-dependent: 20-60% in controlled RCTs and 15-30% in field experiments. Novice workers benefit more from AI on simple tasks, but evidence is mixed on complex tasks. Digital trace data show substitution effects in writing and translation alongside rising demand for AI-specific skills. Surveys and administrative data provide emerging evidence of reduced demand for novice workers in AI-complementary roles, though effects remain modest in representative samples.

Primary Datasets

O*NET; Various exposure measures; RCT evidence; Administrative data

Secondary Datasets

STEP Survey; PIAAC; Patent data; Freelancing platform data

Key Methods
Comprehensive literature review synthesizing theoretical frameworks (growth theory, task-based models, collective intelligence), quantitative comparison of AI exposure measures across multiple studies, systematic review of experimental evidence (RCTs, field experiments, natural experiments), and analysis of digital trace data from online labor platforms
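A minimal, self-contained sketch of the kind of cross-measure comparison described above: comparing two ex-ante AI exposure measures at the occupation level via rank correlation. The SOC codes and exposure scores below are invented for illustration, not the paper's data.

```python
# Toy comparison of two hypothetical occupation-level AI exposure measures.

def rank(values):
    """Return 1-based average ranks, handling ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(values):
        j = i
        # extend j over any tie group starting at i
        while j + 1 < len(values) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rank correlation = Pearson correlation of the ranks."""
    rx, ry = rank(x), rank(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Invented exposure scores for five SOC occupations under two measures
expert_based = {"11-1011": 0.62, "15-1252": 0.81, "23-1011": 0.74,
                "35-3023": 0.20, "43-4051": 0.55}
llm_based    = {"11-1011": 0.58, "15-1252": 0.90, "23-1011": 0.70,
                "35-3023": 0.15, "43-4051": 0.60}

occs = sorted(expert_based)
rho = spearman([expert_based[o] for o in occs],
               [llm_based[o] for o in occs])
print(f"Spearman rho across {len(occs)} occupations: {rho:.2f}")
# prints "Spearman rho across 5 occupations: 0.90"
```

The review's actual comparison spans 758 SOC occupations and many more measures; a rank correlation like this is one simple way to quantify the "convergence towards high-wage jobs" that the abstract reports across measures.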
Sample Period
Literature review covering 2018-2025
Geographic Coverage
Global (primarily US; cross-country exposure estimates for 103 countries)
Sample Size
Reviews more than 70 primary empirical studies; the quantitative exposure analysis covers 758 SOC occupations; individual studies reviewed have samples ranging from dozens to millions of observations
Level of Analysis
Individual, Firm, Occupation, Task
Occupation Classification
SOC, ISCO
Industry Classification
Various (studies use NAICS, industry-specific classifications)
Notes
arXiv:2509.15265. Comprehensive review integrating theory (CES production functions, Acemoglu-Restrepo task model), ex-ante exposure measures (expert-, patent-, and LLM-based), and ex-post evidence (RCTs, natural experiments, administrative data); introduces a simple vs. complex task classification; covers collective intelligence and team performance; ILO-affiliated authors. [Claude classification]: ILO-affiliated review paper. Introduces a novel classification distinguishing simple vs. complex tasks using four dimensions: knowledge requirements, clarity of goal, interdependence, and context requirements. Reviews three levels of evidence: (1) theoretical frameworks from production functions to task-based models; (2) ex-ante AI exposure measures using expert assessments, patent data, and LLM-based methods; (3) ex-post evidence from RCTs, field experiments, natural experiments, and digital trace data. Quantitatively compares multiple AI exposure measures at the SOC occupation level. Identifies key research gaps, including the limited focus on complex tasks in experiments, technologically deterministic exposure measures that ignore adoption dynamics, and limited AI model diversity in studies. Papers reviewed span economics, computer science, management, and sociology.
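The simple vs. complex task distinction in the notes rests on four dimensions. The following sketch operationalizes it as a toy scoring rule; the four dimensions come from the paper, but the scale, weights, threshold, and example tasks are invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class Task:
    name: str
    knowledge_requirements: int  # 1 (low) .. 5 (high)
    goal_clarity: int            # 1 (ambiguous) .. 5 (clear)
    interdependence: int         # 1 (solo) .. 5 (highly coupled)
    context_requirements: int    # 1 (generic) .. 5 (context-heavy)

def is_complex(task: Task, threshold: int = 12) -> bool:
    """Toy rule: high knowledge needs, low goal clarity, high
    interdependence, and high context requirements all push a task
    toward 'complex'."""
    score = (task.knowledge_requirements
             + (6 - task.goal_clarity)  # invert: less clarity -> more complex
             + task.interdependence
             + task.context_requirements)
    return score >= threshold

email_draft = Task("draft a routine email", 1, 5, 1, 1)
system_design = Task("design a distributed system", 5, 2, 4, 5)

print(is_complex(email_draft))    # prints False
print(is_complex(system_design))  # prints True
```

This mirrors the review's observation that experimental evidence skews toward tasks that would score low here (clear goals, little interdependence), which is exactly the research gap the authors flag.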