No Great Equalizer: Experimental Evidence on Productivity Effects of Generative AI Use in the UK Labor Market

Haslberger, Gingrich, Bhatia

2023 | SSRN Electronic Journal | 18 citations
Experimental evidence | Interdisciplinary | Causal
LLM / Generative AI | Writing / content | Junior / entry-level | Senior / older workers | Gender | Human-AI collaboration | Augmentation vs. substitution
Summary

Haslberger, Gingrich, and Bhatia conduct a pre-registered online randomized experiment with 1,041 UK workers to study how ChatGPT exposure affects productivity and inequality across different tasks, demographic groups, and skill levels.

Main Finding

Exposure to ChatGPT significantly improved task performance (0.41-0.92 SD) and reduced completion time (0.32-0.69 SD) across all worker groups. Benefits were greatest for younger workers, so age-based inequality increased, and there was no compression of inequality between educational or occupational groups.

Primary Datasets

Original online experiment conducted via YouGov (July 2023, N=1,041)

Secondary Datasets

None

Key Methods
Pre-registered online randomized experiment with treatment group encouraged to use ChatGPT and control group discouraged from AI use; three text-based tasks of varying complexity; AI-assisted grading using GPT-4; intention-to-treat analysis with compliance checks
Sample Period
2023
Geographic Coverage
UK
Sample Size
1,041 UK working-age adults (504 treatment, 537 control)
Level of Analysis
Individual
Occupation Classification
None
Industry Classification
None
Notes
SSRN Electronic Journal. [Claude classification]: Online lab experiment (not a field experiment, as tasks were researcher-designed rather than real job duties). Used GPT-4 to grade responses, with high inter-coder reliability against human coders. Sample recruited from YouGov panel members who reported having ChatGPT accounts (5,350 from 34,211 screened). Treatment effects represent intention-to-treat (ITT) estimates, as the authors could not fully control compliance. The key finding challenges the emerging consensus about equalizing effects of AI: no compression between skill, educational, or occupational groups, and increased age inequality.
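
For readers unfamiliar with the intention-to-treat (ITT) framing in the notes above, the following is a minimal sketch of how an ITT effect in standard-deviation units could be computed. It compares outcomes by assigned arm, regardless of whether a participant actually used ChatGPT. The function name `itt_effect_sd`, the choice to scale by the control-group SD, and the toy scores are all assumptions for illustration; they are not the authors' code, estimator, or data.

```python
from statistics import mean, stdev


def itt_effect_sd(assigned_treatment, assigned_control):
    """Difference in mean outcome between assigned arms, in control-group SD units.

    Participants are grouped by random assignment only (intention-to-treat),
    not by whether they complied with the assignment.
    Scaling by the control-group SD is an assumption of this sketch.
    """
    raw_difference = mean(assigned_treatment) - mean(assigned_control)
    return raw_difference / stdev(assigned_control)


# Made-up task scores for illustration only; NOT data from the paper.
treatment_scores = [6.0, 7.0, 8.0, 7.0, 9.0]
control_scores = [5.0, 6.0, 4.0, 6.0, 5.0]

effect = itt_effect_sd(treatment_scores, control_scores)
print(f"ITT effect: {effect:.2f} control-group SDs")
```

Because non-compliers stay in their assigned arm, an estimate of this kind is generally read as the effect of being encouraged to use the tool, not the effect of actually using it.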